POLY ZINC FINGER PROTEINS WITH IMPROVED LINKERS


CROSS-REFERENCES TO RELATED APPLICATIONS 
This application claims priority from USSN 60/076,454, filed March 2,
1998, herein incorporated by reference in its entirety.

STATEMENT AS TO INVENTIONS MADE UNDER 
FEDERALLY SPONSORED RESEARCH AND DEVELOPMENT

Work described herein was supported by grants PO1-CA42063, CDR-8803014
and P30-CA14051 from the National Institutes of Health, National Science
Foundation and National Cancer Institute, respectively. The U.S. Government has certain
rights in the invention. Work described herein was also supported by the Howard Hughes 
Medical Institute. 

BACKGROUND OF THE INVENTION 
Zinc fingers belonging to the Cys2-His2 family constitute one of the most
common DNA-binding motifs found in eukaryotes, and these zinc fingers have provided
a very attractive framework for the design and selection of DNA-binding proteins with
novel sequence specificities. Numerous studies have used phage display methods or
design ideas to explore and systematically alter the specificity of zinc finger-DNA
interactions (Desjarlais & Berg, Proteins Struct. Funct. Genet. 12:101-104 (1992);
Desjarlais & Berg, Proc. Natl. Acad. Sci. USA 90:2256-2260 (1993); Rebar & Pabo,
Science 263:671-673 (1994); Jamieson et al., Biochemistry 33:5689-5695 (1994); Choo &
Klug, Proc. Natl. Acad. Sci. USA 91:11163-11167 (1994); Wu et al., Proc. Natl. Acad.
Sci. USA 92:344-348 (1995); and Greisman & Pabo, Science 275:657-661 (1997)).

hybrid proteins that recognize extended sites (Pomerantz et al., Science 267:93-96 (1995);
Kim et al., Proc. Natl. Acad. Sci. USA 94:3616-3620 (1997)). For example, zinc finger
proteins have been linked to a GAL4 dimerization domain to develop novel homo- and
hetero-dimers (Pomerantz et al., Biochemistry 4:965-970 (1997)), and to a nuclease
domain to generate novel restriction enzymes (Kim et al., Proc. Natl. Acad. Sci. USA
93:1156-1160 (1996)). A zinc finger/homeodomain fusion is being tested for potential
applications in gene therapy (Rivera et al., Nature Med. 2:1028-1032 (1996)).

There also have been several attempts to increase affinity and specificity
of zinc finger proteins by adding additional fingers to a three-finger protein (Rebar,
(Ph.D. Thesis), Selection Studies of Zinc Finger-DNA Recognition, Massachusetts Institute
of Technology (1997); Shi, Y. (Ph.D. Thesis) Molecular Mechanisms of Zinc Finger
Protein-Nucleic Acid Interactions, Johns Hopkins University (1995)) or by tandemly
linking two three-finger proteins (Liu et al., Proc. Natl. Acad. Sci. USA 94:5525-5530
(1997)). However, these previous design strategies for poly-finger proteins, which all
used canonical "TGEKP" linkers (linkers having the amino acid sequence threonine-
glycine-glutamate-lysine-proline) to connect the additional fingers, resulted in relatively
modest increases in affinity. There is thus a need to develop linkers that provide
enhanced affinity and specificity to chimeric zinc finger proteins.

SUMMARY OF THE INVENTION 
The present invention therefore provides a method of using structure based
design to select flexible linkers and make chimeric zinc finger proteins with enhanced
affinity and specificity. The present invention also provides a method of making chimeric
zinc finger proteins with enhanced affinity and specificity by using flexible linkers of
5 amino acids or more in length. Zinc finger proteins
made using these methods have binding affinities in the femtomolar range and provide,
e.g., high levels (more than about 70 fold) of transcriptional repression at a single target
site. Such zinc finger proteins can be used for regulation of gene expression, e.g., as
therapeutics, diagnostics, and for research applications such as functional genomics.

In one aspect, the present invention provides a method of making a
chimeric zinc finger protein that binds to adjacent target sites, the method comprising the
steps of: (i) selecting a first and a second DNA-binding domain polypeptide of the
chimeric zinc finger protein, wherein at least one of the domains comprises a zinc finger
polypeptide, and wherein the first domain binds to a first target site and the second
domain binds to a second target site, which target sites are adjacent; (ii) using structure-
based design to determine the physical separation between the first and second domains
when they are individually bound to the first and second target sites; (iii) selecting a
flexible linker that is at least 1-2 Å longer than the physical separation between the first
and second domains; and (iv) fusing the first and second domains with the flexible linker,
thereby making a chimeric zinc finger protein that binds to adjacent target sites.

In another aspect, the present invention provides a method of making a
chimeric zinc finger protein that binds to adjacent target sites, the method comprising the
steps of: (i) selecting a first and a second DNA-binding domain polypeptide of the
chimeric zinc finger protein, wherein at least one of the domains comprises a zinc finger
polypeptide, and wherein the first domain binds to a first target site and the second
domain binds to a second target site, which target sites are adjacent; (ii) selecting a
flexible linker that is five or more amino acids in length; and (iii) fusing the first and
second domains with the flexible linker, thereby making a chimeric zinc finger protein
that binds to adjacent target sites.

In another aspect, the present invention provides a chimeric zinc finger
protein that binds to adjacent target sites, the chimeric zinc finger protein comprising: (i)
a first and a second DNA-binding domain polypeptide of the chimeric zinc finger protein,
wherein at least one of the domains comprises a zinc finger polypeptide, and wherein the
first domain binds to a first target site and the second domain binds to a second target site,
which target sites are adjacent; and (ii) a flexible linker that is at least 1-2 Å longer than
the physical separation between the first and second domains when they are individually
bound to the first and second target sites, as determined by structure-based modeling;
wherein the first and second domains are fused with the flexible linker.

In another aspect, the present invention provides a chimeric zinc finger
protein that binds to adjacent target sites, the chimeric zinc finger protein comprising: (i)
a first and a second DNA-binding domain polypeptide of the chimeric zinc finger protein,
wherein at least one of the domains comprises a zinc finger polypeptide, and wherein the
first domain binds to a first target site and the second domain binds to a second target site,
the chimeric zinc finger proteins.

In one embodiment, the first and the second domains are zinc finger
polypeptides. In another embodiment, the zinc finger polypeptide is selected from the
group consisting of Zif268 and NRE. In another embodiment, the zinc finger
polypeptides are heterologous. In one embodiment, the first domain is a zinc finger
polypeptide and the second domain comprises a heterologous DNA-binding domain
polypeptide. In another embodiment, the chimeric zinc finger protein further comprises a
regulatory domain polypeptide.

In one embodiment, the chimeric zinc finger protein has femtomolar
affinity for the adjacent target sites. In another embodiment, the chimeric zinc finger
protein has about 2-4 femtomolar affinity for the adjacent target sites.

In one embodiment, the flexible linker is 5, 8, or 11 amino acids in length.
In another embodiment, the flexible linker has the sequence RQKDGERP or
RQKDGGGSERP.

In one embodiment, the target sites are separated by one or two
nucleotides.

In one embodiment, the adjacent target sites are separated by zero
nucleotides and the flexible linker is five or six amino acids in length. In another 
embodiment, the adjacent target sites are separated by one nucleotide and the flexible 
linker is seven, eight, or nine amino acids in length. In another embodiment, the adjacent 
target sites are separated by two nucleotides and the flexible linker is ten, eleven, or 
twelve amino acids in length. In another embodiment, the adjacent target sites are 
separated by three nucleotides and the flexible linker is twelve or more amino acids in 
length. 
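
The spacing embodiments above can be read as a lookup from the gap between adjacent target sites to a range of linker lengths. The following is a minimal sketch offered only as an illustration; the helper name `linker_length_range` is hypothetical and not part of the disclosure, and the open-ended case (twelve or more amino acids for a three-nucleotide gap) is represented with `None` as the upper bound.

```python
from typing import Optional, Tuple

# Linker length ranges (in amino acids) keyed by the number of nucleotides
# separating the adjacent target sites, as listed in the embodiments above.
# An upper bound of None stands for "or more".
_LINKER_RANGES = {
    0: (5, 6),
    1: (7, 9),
    2: (10, 12),
    3: (12, None),
}


def linker_length_range(gap_nt: int) -> Tuple[int, Optional[int]]:
    """Return (min, max) linker length for a gap size (hypothetical helper)."""
    if gap_nt not in _LINKER_RANGES:
        raise ValueError(f"no embodiment listed for a gap of {gap_nt} nucleotides")
    return _LINKER_RANGES[gap_nt]


if __name__ == "__main__":
    for gap in range(4):
        print(gap, linker_length_range(gap))
```

The 8- and 11-residue linkers named above (RQKDGERP and RQKDGGGSERP) fall inside the ranges returned for gaps of one and two nucleotides, respectively.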

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 depicts structure-based design of a six finger peptide, 268//NRE.
The cocrystal structure of the Zif268-DNA complex and the template B-DNA (used at the
junction) were aligned by superimposing phosphates (Pavletich & Pabo, Science 252:809-
817 (1991); Elrod-Erickson et al., Structure 4:1171-1180 (1996)). In this model, two
three-finger peptides bind to corresponding 9-bp sites (bases shown in white) separated
by a 2 bp gap (bases shown in gray). Note that the orientation of one three-finger peptide
almost exactly matches that of the other three finger peptide because one helical turn of
this underwound DNA contains 11 bp.

Figure 2 depicts schematic representations of zinc finger peptides and of
reporter constructs used in transfection studies described herein. Figure 2A shows zinc
finger peptides. Each finger is represented with a circle. The amino acid sequence of a
linker in the Zif268 peptide (which has a canonical "TGEKP" linker) is shown, and
longer linkers used to connect the three-finger peptides are indicated below. In each case,
the box on the left denotes the helical region and includes the second of the conserved His
residues of the finger; the zigzag line denotes the first β-sheet of the next finger, which
includes the first of the conserved Cys residues. Figure 2B illustrates promoters of
luciferase reporter genes. The nucleotide positions of the TATA box, the start codon, and
zinc finger binding sites are numbered with respect to the transcription start site (+1).

Figure 3 depicts a gel shift assay. Various amounts (0, 0.01, 0.1, and 1
μM) of the NRE peptide were incubated for 1 hour with free binding sites (lanes 1-4) or
binding sites preincubated with 0.1 nM of the Zif268 peptide for 0.5 hours (lanes 5-8).
The positions of the free DNA and the protein-DNA complexes are indicated.

Figure 4 depicts competition binding studies. In Figure 4A, the 268//NRE
peptide (5 pM) was preincubated with various amounts (0.05, 0.5, 5 and 50 nM) of cold
competitor DNAs (lanes 3-14) for 1 hour, and then a slight molar excess (over the peptide
concentration) of the labeled N/Z site (6-8 pM) was added to the reaction mixture.
Aliquots were analyzed by gel electrophoresis at various time points, and this gel shows
the results after 600 hours of incubation time at room temperature. In Figure 4B, the
268//NRE (lanes 2-6) or Zif268 peptide (lanes 7-11) was mixed with the labeled N/Z site,
a slight molar excess (over the peptide concentration) of unlabeled N/Z site was added (so
that 70% of the labeled site would be shifted in the absence of salmon sperm DNA), and
various amounts of salmon sperm DNA (0, 0.1, 1, 10, and 100 ng/ml) were included.
Samples were analyzed by gel electrophoresis after 24 hours of incubation.

Figure 5 depicts graphs (Figures 5A, 5B, 5C, and 5D) illustrating
transcriptional repression in vivo by zinc finger peptides. Human 293 cells were
transfected as described (Cepek et al., Genes Dev. 10:2079-2088 (1996)) using the






repression) were obtained by dividing 1) the relative luciferase activities from the cells
transfected with the empty expression plasmid by 2) those from the cells transfected with
zinc finger expression plasmids. Different scales are used in graphs for the different
reporters. The 68/NR, 68/NRE, 68//NR, and 68//NRE peptides are variants of six-finger
fusion proteins that are missing one or two of the terminal fingers. Thus the 68/NR
peptide contains fingers 2 and 3 of the Zif268 peptide fused (via the shorter of the two
linkers) to fingers 1 and 2 of the NRE peptide. The data represent an average of three
independent experiments, and the standard error of the mean is shown.
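
The fold-repression calculation described for Figure 5 is a simple ratio, and the reported summary statistic is a mean with its standard error. The sketch below is illustrative only: the function names are hypothetical, and the numeric values in the usage section are invented placeholders, not data from the experiments.

```python
import math
import statistics
from typing import Sequence, Tuple


def fold_repression(activity_empty: float, activity_zfp: float) -> float:
    """Relative luciferase activity with the empty expression plasmid divided
    by the activity with the zinc finger expression plasmid."""
    if activity_zfp <= 0:
        raise ValueError("activity with the zinc finger plasmid must be positive")
    return activity_empty / activity_zfp


def mean_and_sem(values: Sequence[float]) -> Tuple[float, float]:
    """Mean and standard error of the mean (sample stdev / sqrt(n))."""
    if len(values) < 2:
        raise ValueError("at least two independent experiments are required")
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(len(values))


if __name__ == "__main__":
    # Placeholder (empty plasmid, zinc finger plasmid) activity pairs from
    # three hypothetical experiments; these are NOT values from the source.
    experiments = [(100.0, 2.0), (90.0, 1.5), (110.0, 2.5)]
    folds = [fold_repression(e, z) for e, z in experiments]
    print(mean_and_sem(folds))
```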

DETAILED DESCRIPTION OF THE INVENTION 

I. Introduction 

The present invention provides a design strategy for linkers that fuse two
DNA binding domains of a chimeric zinc finger protein. These linkers are flexible and
longer than the canonical linkers previously used, allowing binding of the chimeric zinc
finger protein to its target site without introducing any strain. The target site is typically a
"composite" target site, composed of two adjacent target sites that are separated by zero
to 5 or more nucleotides. Each of the adjacent target sites is recognized by one DNA-
binding domain of the chimeric zinc finger protein. The linker design strategy involves
structure-based design to determine a minimum length for a linker between two DNA-
binding domains, and then adding additional amino acids to the linker to provide at least
about 1-2 additional angstroms of flexibility to the linker. The present invention thus
provides chimeric zinc finger proteins with femtomolar affinity for their target site, and
which effectively repress gene expression, e.g., more than about 70 fold, when targeted to
a single site.

Structural and biochemical analyses show that DNA often is slightly
unwound when bound to zinc finger peptides (Pavletich & Pabo, Science 252:809-817
(1991); Shi & Berg, Biochemistry 35:3845-3848 (1996); Nekludova & Pabo, Proc. Natl.
Acad. Sci. USA 91:6948-6952 (1994)). Modeling studies have shown that on ideal B
DNA, the canonical linker is a bit too short to allow favorable docking of Zif268 (Elrod-
Erickson et al., Structure 4:1171-1180 (1996)); the DNA must be slightly unwound to
interact with zinc fingers in the mode seen in the Zif268 complex. Essentially, it appears
that the helical periodicity of the zinc fingers does not quite match the helical periodicity
of B-DNA. Since the strain of unwinding may become a more serious problem when
there are more fingers (the helical periodicities of the peptide and DNA may get



WO 99/45132 PCT/US99/04441 


progressively further out of phase), longer, more flexible linkers were tested in the design
of poly-finger proteins (see Kim & Pabo, Proc. Nat'l Acad. Sci. U.S.A. 95:2812-2817
(1998), herein incorporated by reference in its entirety).

The present invention demonstrates that linkers of 5 amino acids or more
can be used to make chimeric zinc finger proteins with enhanced affinity. For example, a
linker of 8 amino acids was used for a chimeric zinc finger protein that recognized
adjacent target sites separated by one base pair. A linker of 11 amino acids was used for
a chimeric zinc finger protein that recognized adjacent target sites separated by two base
pairs. The linkers of the invention can also be designed using structure-based modeling.
In structure-based modeling, a model is made that shows the binding of each DNA
binding domain polypeptide to its DNA target site. The model is then used to determine
the physical separation of the domains as they are bound to adjacent target sites. The
physical separation between the domains is used to determine the minimum length of the
linker used to connect the C-terminal amino acid of the first domain with the N-terminal
amino acid of the second domain, without steric hindrance to the linker or the DNA
binding domains. This length is then increased by 1-2 Å, to create a slightly longer,
flexible linker that avoids introducing strain to the chimeric zinc finger protein.
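
The arithmetic of this step can be sketched as follows: take the modeled physical separation, add 1-2 Å of slack, and convert the total distance into a whole number of residues. This is a rough illustration under stated assumptions, not the modeling procedure itself. In particular, the per-residue span of 3.5 Å for an extended peptide chain is an assumed parameter that does not come from the text, the function name is hypothetical, and a real design would be checked against the structural model for steric clashes.

```python
import math


def flexible_linker_residues(
    separation_angstrom: float,
    slack_angstrom: float = 2.0,
    span_per_residue_angstrom: float = 3.5,  # ASSUMPTION: not taken from the source
) -> int:
    """Estimate the number of residues for a flexible linker.

    separation_angstrom: modeled distance between the C-terminus of the first
        domain and the N-terminus of the second domain when both are bound.
    slack_angstrom: extra length added beyond the minimum (the text uses 1-2 Å).
    span_per_residue_angstrom: assumed distance spanned per residue.
    """
    if separation_angstrom <= 0 or span_per_residue_angstrom <= 0:
        raise ValueError("distances must be positive")
    if slack_angstrom < 0:
        raise ValueError("slack must not be negative")
    total = separation_angstrom + slack_angstrom
    # Round up so the linker is never shorter than the required distance.
    return math.ceil(total / span_per_residue_angstrom)


if __name__ == "__main__":
    # 26 Å is a made-up separation used only to exercise the function.
    print(flexible_linker_residues(26.0))
```

Rounding up rather than to the nearest integer reflects the stated goal of avoiding strain: a linker that is slightly too long is preferred over one that is slightly too short.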

Often computer programs are used for structure-based modeling, although
the models can also be made physically. Examples of computer programs used for
structure-based modeling include Insight II (Biosym Technologies, San Diego) and
Quanta 4.0 (Molecular Simulations, Burlington, MA). The programs often use
information derived from x-ray crystallographic studies of DNA-binding proteins to
provide the appropriate coordinates for proteins. This information can also be obtained
from publicly available databases such as the Brookhaven Protein Data Bank. This
information can also be used to extrapolate distances and coordinates for DNA binding
proteins whose crystal structure is unknown. Models of B DNA are well known in the
art. The relevant coordinates (e.g., distances and sizes) are used with the computer modeling
program of choice, using the manufacturer's instructions and default parameters.



(Massachusetts Institute of Technology, Cambridge MA) (1997); Liu et al., Proc. Nat'l
Acad. Sci. U.S.A. 94:5525-5530 (1997); Pomerantz et al., Science 267:93-96 (1995);
Pomerantz et al., Proc. Nat'l Acad. Sci. U.S.A. 92:9752-9756 (1995); Li et al., Nature
Biotechnology 16:190-195 (1998); Kim et al., Proc. Natl. Acad. Sci. USA 94:3616-3620
(1997); and Pomerantz et al., Biochemistry 4:965-970 (1997), herein incorporated by
reference in their entirety). Two basic criteria suggest which alignments of DNA-binding
domains have potential for combination in a chimeric protein which binds DNA: (1) lack
of collision between domains, and (2) consistent positioning of the carboxyl- and amino-
terminal regions of the domains, i.e., the domains are oriented such that the carboxyl-
terminal region of one polypeptide can be joined to the amino-terminal region of the next
polypeptide.

The linker used to link the two DNA-binding domains can comprise any
amino acid sequence that does not substantially hinder interaction of the DNA-binding
domains with their respective target sites. Preferred amino acid residues for linkers of the
present invention include, but are not limited to, glycine, alanine, leucine, serine, valine
and threonine. Once the length of the amino acid sequence has been selected, the
sequence of the linker can be selected, e.g., by phage display library technology (see, e.g.,
U.S. Patent No. 5,260,203), or using naturally occurring or synthetic linker sequences as a
scaffold (e.g., GTGQKP and GEKP, see Liu et al., Proc. Nat'l Acad. Sci. U.S.A. 94:5525-
5530 (1997); see also Whitlow et al., Methods: A Companion to Methods in Enzymology
2:97-105 (1991)). Typically, the linkers of the invention are made by making
recombinant nucleic acids encoding the linker and the DNA-binding domains, which are
fused via the linker amino acid sequence. Optionally, the linkers can also be made using
peptide synthesis, and then linked to the polypeptide DNA-binding domains.

The chimeric zinc finger proteins of the invention are composed of two or
more DNA-binding domains, where at least one of the DNA binding domains is a zinc
finger polypeptide. The second DNA binding domain can be a zinc finger binding
domain, either the same domain or a heterologous domain. Suitable zinc finger proteins
include any protein from the Cys2-His2 family, e.g., SP-1, SP-1C, ZIF268, NRE,
Tramtrack, GLI, YY1, or TFIIIA (see, e.g., Jacobs, EMBO J. 11:4507 (1992); Desjarlais
& Berg, PNAS 90:2256-2260 (1993); Christy et al., PNAS 85:7857-7861 (1988);
Greisman & Pabo, Science 275:657-661 (1997); Fairall et al., Nature 366:483 (1993);
Pavletich et al., Science 261:1701 (1993)).

The second DNA binding domain can also be a heterologous DNA binding
domain, e.g., from a restriction enzyme; a nuclear hormone receptor; a homeodomain
protein or a helix turn helix motif protein such as MAT α1, MAT α2, MAT a1,
Antennapedia, Ultrabithorax, Engrailed, Paired, Fushi tarazu, HOX, Unc86, Oct1, Oct2,
Pit, lambda repressor and tet repressor; Gal 4; TATA binding protein; helix loop helix
motif proteins such as myc, myo D, Daughterless, Achaete-scute (T3), E12, and E47;
leucine zipper type proteins such as GCN4, C/EBP, c-Fos/c-Jun and JunB; and beta sheet
motif proteins such as met, arc, and mnt repressors. In another embodiment, the zinc
finger protein is linked to one or more regulatory domains, described below.
Preferred regulatory domains include transcription factor repressor or activator domains
such as KRAB and VP16, co-repressor and co-activator domains, DNA methyl
transferases, histone acetyltransferases, histone deacetylases, and endonucleases such as
FokI. The amino acid sequences of the DNA-binding domains may be naturally-
occurring or non-naturally-occurring (or modified).

The expression of chimeric zinc finger proteins can also be controlled by
systems typified by the tet-regulated systems and the RU-486 system (see, e.g., Gossen &
Bujard, PNAS 89:5547 (1992); Oligino et al., Gene Ther. 5:491-496 (1998); Wang et al.,
Gene Ther. 4:432-441 (1997); Neering et al., Blood 88:1147-1155 (1996); and Rendahl et
al., Nat. Biotechnol. 16:757-761 (1998)). These impart small molecule control on the
expression of the chimeric zinc finger protein and thus impart small molecule control on
the target gene(s) of interest. This beneficial feature could be used in cell culture models,
in gene therapy, and in transgenic animals and plants.

The binding specificity of the chimeric DNA-binding proteins makes them
particularly useful because they have DNA-binding properties distinct from those of
known proteins. The chimeric proteins prefer to bind the adjacent target sites and, thus,
can be used to modulate expression of genes having the adjacent target sites. These
chimeric zinc finger proteins have an affinity for the adjacent target sites that is in the
femtomolar range, e.g., 100 femtomolar, 10 femtomolar, or less, in some cases as low as
2-4 femtomolar, and in some cases 1 femtomolar or lower.

The zinc finger proteins made using the method of the invention have
therapeutic applications, e.g., treatment of genetic diseases, cancer, fungal, protozoal,
bacterial, and viral infection, ischemia, vascular disease, arthritis, immunological
disorders, etc., as well as providing means for developing plants with altered phenotypes,
including disease resistance, fruit ripening, sugar and oil composition, yield, and color. In
addition, the zinc finger proteins of the present invention can be used for diagnostic
assays and for functional genomics assays.

As described herein, zinc finger proteins can be designed to recognize any
suitable target site for any of the uses described herein, e.g., eukaryotic and prokaryotic
genes, cellular genes, viral genes, protozoal genes, fungal genes, and bacterial genes. In
general, suitable genes to be regulated include cytokines, lymphokines, growth factors,
mitogenic factors, chemotactic factors, onco-active factors, receptors, potassium
channels, G-proteins, signal transduction molecules, and other disease-related genes.

A general theme in transcription factor function is that simple binding and 
sufficient proximity to the promoter are all that is generally needed. Exact positioning 
relative to the promoter, orientation, and within limits, distance do not matter greatly. 
This feature allows considerable flexibility in choosing sites for constructing zinc finger 
proteins. The target site recognized by the zinc finger protein therefore can be any
suitable site in the target gene that will allow activation or repression of gene expression
by a zinc finger protein, optionally linked to a regulatory domain. 

Preferred target sites include regions adjacent to, downstream, or upstream
of the transcription start site. In addition, target sites can be located in enhancer regions,
repressor sites, RNA polymerase pause sites, and specific regulatory sites (e.g., SP-1
sites, hypoxia response elements, nuclear receptor recognition elements, p53 binding
sites), or in the cDNA encoding region or an expressed sequence tag (EST) coding
region. As described below, typically each finger recognizes 2-4 base pairs, with a two
finger zinc finger protein binding to a 4 to 7 bp target site, a three finger zinc finger
protein binding to a 6 to 10 base pair site, and a six finger zinc finger protein binding to
two adjacent target sites, each target site having from 6-10 base pairs.

Chimeric zinc finger proteins of the invention can be tested for activity in
vivo using a simple assay (Current Protocols in Molecular Biology (Ausubel et al., eds,
1994)). The in vivo assay uses a plasmid encoding the chimeric zinc finger protein,
which is co-expressed with a reporter plasmid containing a test gene, e.g., the luciferase
gene, the chloramphenicol acetyl transferase (CAT) gene or the human growth hormone
(hGH) gene, with a target site for the chimeric zinc finger protein. The two plasmids are
introduced together into host cells. A second group of cells serves as the control group
and receives a plasmid encoding the transcription factor and a plasmid containing the
reporter gene without the binding site for the transcription factor.

The production of reporter gene transcripts or the amount of activity of the
relevant protein is measured; if mRNA synthesis from the reporter gene or the amount of
activity of the relevant protein is greater than that of the control gene, the transcription
factor is a positive regulator of transcription. If reporter gene mRNA synthesis or the
amount of activity of the relevant protein is less than that of the control, the transcription
factor is a negative regulator of transcription.

Optionally, the assay may include a transfection efficiency control
plasmid. This plasmid expresses a gene product independent of the reporter gene, and the
amount of this gene product indicates roughly how many cells are taking up the plasmids
and how efficiently the DNA is being introduced into the cells. The chimeric zinc finger
protein can also be tested for modulation of an endogenous gene in vivo, using methods
known to those of skill in the art.

In one embodiment, the present invention provides a fusion in which the
three-finger Zif268 peptide was linked to a designed three-finger peptide (designated
"NRE") that specifically recognizes a nuclear hormone response element (Greisman &
Pabo, Science 275:657 (1997)). Gel shift assays indicate that this six-finger peptide,
268//NRE, binds to a composite 18 bp DNA site with a dissociation constant in the
femtomolar range. The slightly longer linkers used in this fusion protein provide a
dramatic improvement in DNA-binding affinity, working much better than the canonical
"TGEKP" linkers that have been used in previous studies. Tissue culture transfection
experiments also show that the 268//NRE peptide is an extremely effective repressor,
giving 72-fold repression when targeted to a binding site close to the transcription start
site. Using this strategy and linking peptides selected via phage display allows the design
of novel DNA-binding proteins with extraordinary affinity and specificity for use in
biological applications and gene therapy.

The new six-finger peptides bind far more tightly than previously reported 



of Zinc Finger-DNA Recognition, Massachusetts Institute of Technology (1997)), by Shi
(Shi, (Ph.D. Thesis), Molecular Mechanisms of Zinc Finger Protein-Nucleic Acid
Interactions, Johns Hopkins University (1995)), and by Liu et al. (Liu et al., Proc. Natl.
Acad. Sci. USA 94:5525-5530 (1997)). Each study compared binding of the new poly-
finger protein (at the appropriate extended site) with binding of the original three-finger
peptide. Using canonical linkers, a four-finger peptide bound 6.3 times more tightly than
the corresponding three-finger peptide (Rebar (Ph.D. Thesis), Selection Studies of Zinc
Finger-DNA Recognition, Massachusetts Institute of Technology (1997)), a five-finger
construct showed no improvement in Kd over the original three-finger peptide (Shi,
(Ph.D. Thesis), Molecular Mechanisms of Zinc Finger Protein-Nucleic Acid Interactions,
Johns Hopkins University (1995)), and six-finger peptides bound 58-74-fold more tightly
than the corresponding three-finger peptides (Liu et al., Proc. Natl. Acad. Sci. USA
94:5525-5530 (1997)).

In contrast, the peptides described herein (see Example section) bind 
6,000-90,000-fold more tightly than the original three-finger peptides. It seems likely 
that the longer linkers used in the 268/NRE and 268//NRE constructs must relieve some 
strain that accumulates when a larger set of fingers all are connected with canonical 
linkers. Presumably this involves a slight mismatch in the helical periodicity of the DNA
and the preferred helical periodicity of the zinc fingers, causing them to fall slightly out of 
register, particularly when 4 or more fingers are connected via canonical linkers. 

II. Definitions 

As used herein, the following terms have the meanings ascribed to them 
unless specified otherwise. 

The term "zinc finger protein" or "ZFP" or "zinc finger polypeptide"
refers to a protein having DNA binding domains that are stabilized by zinc. The
individual DNA binding domains are typically referred to as "fingers." A zinc finger
protein has at least one finger, typically two fingers, three fingers, or six fingers. Each
finger binds from two to four base pairs of DNA, typically three or four base pairs of
DNA (the "subsite"). A zinc finger protein binds to a nucleic acid sequence called a
target site or target segment. Each finger typically comprises an approximately 30 amino
acid, zinc-chelating, DNA-binding subdomain. An exemplary motif characterizing one
class of these proteins (C2H2 class) is -Cys-(X)2-4-Cys-(X)12-His-(X)3-5-His (where X is
any amino acid). Studies have demonstrated that a single zinc finger of this class consists
of an alpha helix containing the two invariant histidine residues co-ordinated with zinc
along with the two cysteine residues of a single beta turn (see, e.g., Berg & Shi, Science
271:1081-1085 (1996)).
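
The exemplary C2H2 motif above maps directly onto a regular expression over one-letter amino acid codes. The sketch below is a hedged illustration, not a validated finger predictor: the pattern encodes only the spacing stated in the motif, the helper name `find_c2h2_fingers` is hypothetical, and the test sequence in the usage section is synthetic rather than taken from any real protein.

```python
import re
from typing import List, Tuple

# -Cys-(X)2-4-Cys-(X)12-His-(X)3-5-His, where X is any amino acid,
# written with one-letter codes (C = cysteine, H = histidine).
C2H2_MOTIF = re.compile(r"C.{2,4}C.{12}H.{3,5}H")


def find_c2h2_fingers(sequence: str) -> List[Tuple[int, str]]:
    """Return (start index, matched text) for each non-overlapping motif hit."""
    return [(m.start(), m.group()) for m in C2H2_MOTIF.finditer(sequence.upper())]


if __name__ == "__main__":
    # Synthetic sequence built to satisfy the spacing rules; not a real protein.
    synthetic = "GG" + "C" + "AA" + "C" + "A" * 12 + "H" + "AAA" + "H" + "GG"
    print(find_c2h2_fingers(synthetic))
```

Because `finditer` reports non-overlapping matches scanned left to right, tandem fingers in a poly-finger sequence would be listed one after another; matching the spacing alone does not establish that a segment folds or binds zinc.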

A "chimeric" zinc finger protein refers to a protein that has at least two 
DNA-binding domains, one of which is a zinc finger polypeptide, linked to the other 
domain via a flexible linker. The two domains can be the same or heterologous. Both 
domains can be zinc finger proteins, either the same zinc finger protein or heterologous 
zinc finger proteins. Alternatively, one domain can be a heterologous DNA-binding 
protein. 

A "target site" is the nucleic acid sequence recognized by a zinc finger 
protein or a heterologous DNA-binding polypeptide. For a zinc finger protein, a single 
target site typically has about four to about ten base pairs. Typically, a two-fingered zinc
finger protein recognizes a four to seven base pair target site, a three-fingered zinc finger
protein recognizes a six to ten base pair target site, and a six fingered zinc finger protein 
recognizes two adjacent nine to ten base pair target sites. 

A "subsite" is a subsequence of the target site, and corresponds to a 
portion of the target site recognized by a single finger, e.g., a 2-4 base subsite, typically a 
3 base subsite. A target site comprises at least two, typically three, four, five, six or more 
subsites, one for each finger of the protein. 

The term "adjacent target sites" refers to non-overlapping target sites that 
are separated by zero to about 5 base pairs. 

The "physical separation" between two DNA-binding domains refers to 
the distance between two domains when they are bound to their respective target sites. 
This distance is used to determine a minimum length of a linker. A minimum length of a 
linker is the length that would allow the two domains to be connected without providing 
steric hindrance to the domains or the linker (a minimum linker). A linker that provides 
more than the minimum length is a "flexible linker." 

"Structure based design" refers to methods of determining the length of 
minimum linkers and flexible linkers, using physical or computer models of DNA-binding 
proteins bound to their respective target sites. 

"Kd" refers to the dissociation constant for the compound, i.e., the concentration 
of a compound (e.g., a zinc finger protein) that gives half maximal binding 
of the compound to its target (i.e., half of the compound molecules are bound to the 
target) under given conditions (i.e., when [target] ≪ Kd), as measured using a given assay 
system (see, e.g., U.S. Patent No. 5,789,538). The assay system used to measure the Kd 
should be chosen so that it gives the most accurate measure of the actual Kd of the zinc 
finger protein. Any assay system can be used, as long as it gives an accurate measurement 
of the actual Kd of the zinc finger protein. In one embodiment, the Kd for the zinc finger 
proteins of the invention is measured using an electrophoretic mobility shift assay 
("EMSA"), as described herein. Unless an adjustment is made for zinc finger protein 
purity or activity, Kd calculations may result in an underestimate of the true Kd of a given 
zinc finger protein. 
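
The half-maximal binding condition in the Kd definition can be illustrated numerically. This is a minimal sketch, assuming simple 1:1 binding and [target] ≪ Kd, so that the free protein concentration is approximately the total protein concentration; the function name is hypothetical.

```python
def fraction_bound(protein_conc: float, kd: float) -> float:
    """Fraction of target bound at a given protein concentration.

    Assumes 1:1 binding with [target] much lower than Kd, so free protein
    is approximated by total protein. Both values use the same units.
    """
    if not (kd > 0 and protein_conc >= 0):
        raise ValueError("kd must be positive and protein_conc non-negative")
    return protein_conc / (protein_conc + kd)


if __name__ == "__main__":
    kd = 1e-9  # illustrative value, 1 nM
    for conc in (1e-10, 1e-9, 1e-8):
        print(f"{conc:.0e} M -> {fraction_bound(conc, kd):.2f} bound")
```

When the protein concentration equals Kd, the function returns 0.5, which corresponds to the half of the target molecules being bound in the definition above.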

The phrase "adjacent to a transcription initiation site" refers to a target site 
that is within about 50 bases either upstream or downstream of a transcription initiation 
site. "Upstream" of a transcription initiation site refers to a target site that is more than 
about 50 bases 5' of the transcription initiation site (i.e., in the non-transcribed region of 
the gene). 

The phrase "RNA polymerase pause site" is described in Uptain et al., 
Annu. Rev. Biochem. 66:117-172 (1997). 

The term "heterologous" is a relative term, which when used with 
reference to portions of a nucleic acid indicates that the nucleic acid comprises two or 
more subsequences that are not found in the same relationship to each other in nature. 
For instance, a nucleic acid that is recombinantly produced typically has two or more 
sequences from unrelated genes synthetically arranged to make a new functional nucleic 
acid, e.g., a promoter from one source and a coding region from another source. The two 
nucleic acids are thus heterologous to each other in this context. When added to a cell, 
the recombinant nucleic acids would also be heterologous to the endogenous genes of the 
cell. Thus, in a chromosome, a heterologous nucleic acid would include a non-native 
(non-naturally occurring) nucleic acid that has integrated into the chromosome, or a non- 
native (non-naturally occurring) extrachromosomal nucleic acid. In contrast, a naturally 
translocated piece of chromosome would not be considered heterologous in the context of 
this patent application, as it comprises an endogenous nucleic acid sequence that is native 
to the mutated cell. 

A "regulatory domain" refers to a protein or a protein domain that has an 
activity such as transcriptional modulation activity, DNA modifying activity, protein 
modifying activity and the like when tethered to a DNA binding domain, i.e., a zinc 
finger protein. Examples of regulatory domains include proteins or effector domains of 
proteins, e.g., transcription factors and co-factors (e.g., KRAB, MAD, ERD, SID, nuclear 
factor kappa B subunit p65, early growth response factor 1, and nuclear hormone 
receptors, VP16, VP64), endonucleases, integrases, recombinases, methyltransferases, 
histone acetyltransferases, histone deacetylases, etc. Activators and repressors include 
co-activators and co-repressors (see, e.g., Utley et al., Nature 394:498-502 (1998)). 

A "heterologous DNA-binding domain" refers to a DNA binding domain 
from a protein that is not a zinc finger protein, such as a restriction enzyme, a nuclear 
hormone receptor, a homeodomain protein such as engrailed or antennapedia, a bacterial 
helix turn helix motif protein such as lambda repressor and tet repressor, Gal4, TATA 
binding protein, helix loop helix motif proteins such as myc and myoD, leucine zipper 
type proteins such as fos and jun, and beta sheet motif proteins such as met, arc, and mnt 
repressors. 

"Humanized" refers to a non-human polypeptide sequence that has been 
modified to minimize immunoreactivity in humans, typically by altering the amino acid 
sequence to mimic existing human sequences, without substantially altering the function 
of the polypeptide sequence (see, e.g., Jones et al., Nature 321:522-525 (1986), and 
published UK patent application No. 8707252). Backbone sequences for the zinc finger 
proteins are preferably selected from existing human C2H2 zinc finger proteins (e.g., 
SP-1). Functional domains are preferably selected from existing human genes (e.g., the 
activation domain from the p65 subunit of NF-κB). Where possible, the recognition helix 
sequences will be selected from the thousands of existing zinc finger protein DNA 
recognition domains provided by sequencing the human genome. As much as possible, 
domains will be combined as units from the same existing proteins. All of these steps 
will minimize the introduction of new junctional epitopes in the chimeric zinc finger 
proteins and render the engineered zinc finger proteins less immunogenic. 

"Nucleic acid" refers to deoxyribonucleotides or ribonucleotides and 
polymers thereof in either single- or double-stranded form. The term encompasses 
nucleic acids containing known nucleotide analogs or modified backbone residues or 
linkages, which are synthetic, naturally occurring, and non-naturally occurring, which 
have similar binding properties as the reference nucleic acid, and which are metabolized 
in a manner similar to the reference nucleotides. Examples of such analogs include, 
without limitation, phosphorothioates, phosphoramidates, methyl phosphonates, chiral- 
methyl phosphonates, 2-O-methyl ribonucleotides, peptide-nucleic acids (PNAs). Unless 
otherwise indicated, a particular nucleic acid sequence also implicitly encompasses 
conservatively modified variants thereof (e.g., degenerate codon substitutions) and 
complementary sequences, as well as the sequence explicitly indicated. The term nucleic 
acid is used interchangeably with gene, cDNA, mRNA, oligonucleotide, and 
polynucleotide. The nucleotide sequences are displayed herein in the conventional 5'-3' 
orientation. 

The terms "polypeptide," "peptide" and "protein" are used interchangeably 
herein to refer to a polymer of amino acid residues. The terms apply to amino acid 
polymers in which one or more amino acid residue is an analog or mimetic of a 
corresponding naturally occurring amino acid, as well as to naturally occurring amino 
acid polymers. Polypeptides can be modified, e.g., by the addition of carbohydrate 
residues to form glycoproteins. The terms "polypeptide," "peptide" and "protein" include 
glycoproteins, as well as non-glycoproteins. The polypeptide sequences are displayed 
herein in the conventional N-terminal to C-terminal orientation. 

The term "amino acid" refers to naturally occurring and synthetic amino 
acids, as well as amino acid analogs and amino acid mimetics that function in a manner 
similar to the naturally occurring amino acids. Naturally occurring amino acids are those 
encoded by the genetic code, as well as those amino acids that are later modified, e.g., 
hydroxyproline, carboxyglutamate, and O-phosphoserine. Amino acid analogs refers to 
compounds that have the same basic chemical structure as a naturally occurring amino 
acid, i.e., an α carbon that is bound to a hydrogen, a carboxyl group, an amino group, and 
an R group, e.g., homoserine, norleucine, methionine sulfoxide, and methionine methyl 
sulfonium. Such analogs have modified R groups (e.g., norleucine) or modified peptide 
backbones, but retain the same basic chemical structure as a naturally occurring amino 
acid. Amino acid mimetics refers to chemical compounds that have a structure that is 
different from the general chemical structure of an amino acid, but that function in a 
manner similar to a naturally occurring amino acid. 

"Conservatively modified variants" applies to both amino acid and nucleic 
acid sequences. With respect to particular nucleic acid sequences, conservatively 
modified variants refers to those nucleic acids which encode identical or essentially 
identical amino acid sequences, or where the nucleic acid does not encode an amino acid 
sequence, to essentially identical sequences. Specifically, degenerate codon substitutions 
may be achieved by generating sequences in which the third position of one or more 
selected (or all) codons is substituted with mixed-base and/or deoxyinosine residues 
(Batzer et al., Nucleic Acid Res. 19:5081 (1991); Ohtsuka et al., J. Biol. Chem. 260:2605- 
2608 (1985); Rossolini et al., Mol. Cell. Probes 8:91-98 (1994)). Because of the 
degeneracy of the genetic code, a large number of functionally identical nucleic acids 
encode any given protein. For instance, the codons GCA, GCC, GCG and GCU all 
encode the amino acid alanine. Thus, at every position where an alanine is specified by a 
codon in a nucleic acid herein, the codon can be altered to any of the corresponding 
codons described without altering the encoded polypeptide. Such nucleic acid variations 
are "silent variations," which are one species of conservatively modified variations. 
Every nucleic acid sequence herein which encodes a polypeptide also describes every 
possible silent variation of the nucleic acid. One of skill will recognize that each codon in 
a nucleic acid (except AUG, which is ordinarily the only codon for methionine, and TGG, 
which is ordinarily the only codon for tryptophan) can be modified to yield a functionally 
identical molecule. Accordingly, each silent variation of a nucleic acid which encodes a 
polypeptide is implicit in each described sequence. 
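
The notion of silent variation can be made concrete by counting synonymous codons. The following is a minimal sketch, assuming the standard genetic code written in the DNA alphabet (T in place of U); the function names are illustrative, and the count returned includes the starting sequence itself.

```python
from itertools import product
from math import prod

# Standard genetic code, codons ordered with bases T, C, A, G
# (third position varying fastest). '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codons(codon: str) -> list[str]:
    """Return all codons (sorted) that encode the same residue as `codon`."""
    residue = CODON_TABLE[codon.upper().replace("U", "T")]
    return sorted(c for c, aa in CODON_TABLE.items() if aa == residue)


def count_silent_variants(coding_seq: str) -> int:
    """Number of nucleic acid sequences encoding the same polypeptide."""
    seq = coding_seq.upper().replace("U", "T")
    if len(seq) % 3:
        raise ValueError("sequence length must be a multiple of three")
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    return prod(len(synonymous_codons(c)) for c in codons)


if __name__ == "__main__":
    print(synonymous_codons("GCA"))            # the four alanine codons
    print(count_silent_variants("AUGGCAUGG"))  # Met-Ala-Trp
```

For a Met-Ala-Trp coding sequence only the alanine codon can vary, consistent with the statement above that the methionine and tryptophan codons are ordinarily unique.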

As to amino acid and nucleic acid sequences, individual substitutions, 
deletions or additions that alter, add or delete a single amino acid or nucleotide or a small 
percentage of amino acids or nucleotides in the sequence create a "conservatively 
modified variant," where the alteration results in the substitution of an amino acid with a 
chemically similar amino acid. Conservative substitution tables providing functionally 
similar amino acids are well known in the art. Such conservatively modified variants are 
in addition to and do not exclude polymorphic variants and alleles of the invention. 

The following groups each contain amino acids that are conservative 
substitutions for one another: 

1) Alanine (A), Glycine (G); 

2) Serine (S), Threonine (T); 

3) Aspartic acid (D), Glutamic acid (E); 

6) Arginine (R), Lysine (K), Histidine (H); 

7) Isoleucine (I), Leucine (L), Valine (V); and 

8) Phenylalanine (F), Tyrosine (Y), Tryptophan (W). 

(see, e.g., Creighton, Proteins (1984) for a discussion of amino acid 
properties). 

III. Design of Zinc Finger Proteins 

The chimeric zinc finger proteins of the invention comprise at least one 
zinc finger polypeptide linked via a flexible linker to at least a second DNA binding 
domain, which optionally is a second zinc finger polypeptide. The chimeric zinc finger 
protein may contain more than two DNA-binding domains, as well as one or more 
regulatory domains. The zinc finger polypeptides of the invention can be engineered to 
recognize a selected target site in the gene of choice. Typically, a backbone from any 
suitable C2H2 ZFP, such as SP-1, SP-1C, or ZIF268, is used as the scaffold for the 
engineered zinc finger polypeptides (see, e.g., Jacobs, EMBO J. 11:4507 (1992); 
Desjarlais & Berg, PNAS 90:2256-2260 (1993)). A number of methods can then be used 
to design and select a zinc finger polypeptide with high affinity for its target. A zinc 
finger polypeptide can be designed or selected to bind to any suitable target site in the 
target gene, with high affinity. Co-pending patent application USSN , filed 
January 12, 1999 (TTC attorney docket no. 019496-001800, herein incorporated by 
reference), comprehensively describes methods for design, construction, and expression 
of zinc finger polypeptides for selected target sites. 

Any suitable method known in the art can be used to design and construct 
nucleic acids encoding zinc finger polypeptides, e.g., phage display, random mutagenesis, 
combinatorial libraries, computer/rational design, affinity selection, PCR, cloning from 
cDNA or genomic libraries, synthetic construction and the like (see, e.g., U.S. Pat. No. 
5,789,538; Wu et al., PNAS 92:344-348 (1995); Jamieson et al., Biochemistry 33:5689- 
5695 (1994); Rebar & Pabo, Science 263:671-673 (1994); Choo & Klug, PNAS 
91:11163-11167 (1994); Choo & Klug, PNAS 91:11168-11172 (1994); Desjarlais & 
Berg, PNAS 90:2256-2260 (1993); Desjarlais & Berg, PNAS 89:7345-7349 (1992); 
Pomerantz et al., Science 267:93-96 (1995); Pomerantz et al., PNAS 92:9752-9756 
(1995); Liu et al., PNAS 94:5525-5530 (1997); Greisman & Pabo, Science 275:657- 
661 (1997); and Desjarlais & Berg, PNAS 91:11099-11103 (1994)). 

In a preferred embodiment, copending application USSN , filed 
January 12, 1999 (TTC attorney docket no. 019496-001800) provides methods that select 
a target gene, and identify a target site within the gene containing one to six (or more) 
D-able sites (see definition below). Using these methods, a zinc finger polypeptide can then 
be synthesized that binds to the preselected site. These methods of target site selection 
are premised, in part, on the recognition that the presence of one or more D-able sites in a 
target segment confers the potential for higher binding affinity in a zinc finger 
polypeptide selected or designed to bind to that site relative to zinc finger polypeptides 
that bind to target segments lacking D-able sites. 

A D-able site or subsite is a region of a target site that allows an 
appropriately designed single zinc finger to bind to four bases rather than three of the 
target site. Such a zinc finger binds to a triplet of bases on one strand of a double-stranded 
target segment (target strand) and a fourth base on the other strand (see Figure 2 
of copending application USSN , filed January 12, 1999 (TTC attorney docket no. 
019496-001800)). Binding of a single zinc finger to a four base target segment imposes 
constraints both on the sequence of the target strand and on the amino acid sequence of 
the zinc finger. The target site within the target strand should include the "D-able" site 
motif 5' NNGK 3', in which N and K are conventional IUPAC-IUB ambiguity codes. A 
zinc finger for binding to such a site should include an arginine residue at position -1 and 
an aspartic acid (or less preferably a glutamic acid) at position +2. The arginine residue 
at position -1 interacts with the G residue in the D-able site. The aspartic acid (or 
glutamic acid) residue at position +2 of the zinc finger interacts with the opposite strand 
base complementary to the K base in the D-able site. It is the interaction between aspartic 
acid (symbol D) and the opposite strand base (fourth base) that confers the name D-able 
site. As is apparent from the D-able site formula, there are two subtypes of D-able sites: 
5' NNGG 3' and 5' NNGT 3'. For the former site, the aspartic acid or glutamic acid at 
position +2 of a zinc finger interacts with a C in the opposite strand to the D-able site. In 
the latter site, the aspartic acid or glutamic acid at position +2 of a zinc finger interacts 
with an A in the opposite strand to the D-able site. In general, NNGG is preferred over 
NNGT. 
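
The two D-able subtypes and the corresponding opposite-strand base can be sketched as a simple lookup. This is an illustrative sketch only, assuming a four-base subsite written 5' to 3' in the DNA alphabet; the function name is hypothetical.

```python
from typing import Optional

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def d_able_opposite_base(subsite: str) -> Optional[str]:
    """For a 4-base subsite (5' to 3'), return the opposite-strand base
    contacted by the residue at position +2 if the subsite matches
    5' NNGK 3' (K = G or T); return None if it is not a D-able subsite."""
    s = subsite.upper()
    if len(s) != 4 or any(base not in COMPLEMENT for base in s):
        raise ValueError("expected four bases drawn from A, C, G, T")
    if s[2] == "G" and s[3] in "GT":
        # The contacted base is the complement of the K base.
        return COMPLEMENT[s[3]]
    return None


if __name__ == "__main__":
    for site in ("AAGG", "CTGT", "AAGA"):
        print(site, d_able_opposite_base(site))
```

An NNGG subsite yields C and an NNGT subsite yields A, matching the two subtypes described above.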

In the design of a zinc finger polypeptide with three fingers, a target site 
should be selected in which at least one finger of the protein, and optionally two or all 
three fingers, have the potential to bind a D-able site. Such a target site can be selected 
from within a larger target gene having the formula 5'-NNx aNy bNzc-3', 
wherein 
each of the sets (x, a), (y, b) and (z, c) is either (N, N) or (G, K); 
at least one of (x, a), (y, b) and (z, c) is (G, K); and 
N and K are IUPAC-IUB ambiguity codes. 

In other words, at least one of the three sets (x, a), (y, b) and (z, c) is the 
set (G, K), meaning that the first position of the set is G and the second position is G or T. 
Those of the three sets (if any) which are not (G, K) are (N, N), meaning that the first 
position of the set can be occupied by any nucleotide and the second position of the set 
can be occupied by any nucleotide. As an example, the set (x, a) can be (G, K) and the 
sets (y, b) and (z, c) can both be (N, N). 

In the formula 5'-NNx aNy bNzc-3', the triplets of NNx, aNy and bNz 
represent the triplets of bases on the target strand bound by the three fingers in a zinc 
finger polypeptide. If only one of x, y and z is a G, and this G is followed by a K, the 
target site includes a single D-able subsite. For example, if only x is G, and a is K, the 
site reads 5'-NNG KNy bNzc-3' with the D-able subsite highlighted. If both x and y but 
not z are G, and a and b are K, then the target site has two overlapping D-able subsites as 
follows: 5'-NNG KNG KNz c-3', with one such site being represented in bold and the 
other in italics. If all three of x, y and z are G and a, b, and c are K, then the target 
segment includes three D-able subsites, as follows: 5'-NNG KNG KNG K-3', the D-able 
subsites being represented by bold, italics and underline. 

These methods thus work by selecting a target gene, and systematically 
searching within the possible subsequences of the gene for target sites conforming to the 
formula 5'-NNx aNy bNzc-3', as described above. In some such methods, every possible 
subsequence of 10 contiguous bases on either strand of a potential target gene is evaluated 
to determine whether it conforms to the above formula, and, if so, how many D-able sites 
are present. Typically, such a comparison is performed by computer, and a list of target 
sites conforming to the formula is output. Optionally, such target sites can be output in 
different subsets according to how many D-able sites are present. 
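
The computer search described above can be sketched as a sliding-window scan. This is a minimal illustration, not the implementation of the copending application: it assumes a gene sequence given in the DNA alphabet, evaluates every 10-base window on both strands against 5'-NNx aNy bNzc-3', and groups hits by the number of D-able sites. All names are hypothetical.

```python
from collections import defaultdict
from typing import Iterator, NamedTuple

COMPLEMENT = str.maketrans("ACGT", "TGCA")
# Zero-based positions of the sets (x, a), (y, b) and (z, c) in a 10-base window.
SET_POSITIONS = ((2, 3), (5, 6), (8, 9))


class Hit(NamedTuple):
    strand: str   # "+" for the given strand, "-" for its reverse complement
    start: int    # offset within the scanned strand (5' to 3')
    site: str     # the 10-base target site
    d_able: int   # number of sets equal to (G, K)


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def count_d_able(window: str) -> int:
    """Count sets that are (G, K), i.e. G followed by G or T."""
    return sum(1 for i, j in SET_POSITIONS
               if window[i] == "G" and window[j] in "GT")


def find_target_sites(gene: str) -> Iterator[Hit]:
    """Yield every 10-base window with at least one D-able site."""
    gene = gene.upper()
    for strand, seq in (("+", gene), ("-", reverse_complement(gene))):
        for start in range(len(seq) - 9):
            window = seq[start:start + 10]
            if all(base in "ACGT" for base in window):
                n = count_d_able(window)
                if n:
                    yield Hit(strand, start, window, n)


def group_by_d_able(gene: str) -> dict[int, list[Hit]]:
    """Output the hits in subsets according to the number of D-able sites."""
    groups: dict[int, list[Hit]] = defaultdict(list)
    for hit in find_target_sites(gene):
        groups[hit.d_able].append(hit)
    return dict(groups)
```

For example, `group_by_d_able("AAGGAGTAGG")` would place the plus-strand window in the subset with three D-able sites. Offsets for the minus strand are reported relative to the reverse complement, which is a simplification a practical tool would likely convert back to gene coordinates.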

In a variation, the methods of the invention identify first and second target 
segments, each independently conforming to the above formula. The two target segments 
in such methods are constrained to be adjacent or proximate (i.e., within about 0-5 bases) 
of each other in the target gene. The strategy underlying selection of proximate target 
segments is to allow the design of a zinc finger polypeptide formed by linkage of two 
component zinc finger polypeptides specific for the first and second target segments 
respectively. These principles can be extended to select target sites to be bound by zinc 
finger polypeptides with any number of component fingers. For example, a suitable 
target site for a nine finger protein would have three component segments, each 
conforming to the above formula. 
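
Selection of adjacent or proximate target segments can be sketched as a pairing step over previously identified sites. The sketch below assumes that each segment is 10 bases long, that all start offsets refer to the same strand, and that segments must not overlap; the function name and default values are illustrative assumptions.

```python
def proximate_pairs(starts, site_len=10, max_gap=5):
    """Return (first_start, second_start, gap) for every pair of
    non-overlapping segments separated by 0 to max_gap bases.

    `starts` are offsets of conforming segments on a single strand.
    """
    ordered = sorted(starts)
    pairs = []
    for i, first in enumerate(ordered):
        for second in ordered[i + 1:]:
            gap = second - (first + site_len)
            if gap > max_gap:
                break          # later segments are even farther away
            if gap >= 0:       # negative gap means the segments overlap
                pairs.append((first, second, gap))
    return pairs


if __name__ == "__main__":
    print(proximate_pairs([0, 5, 10, 13, 30]))
```

With the offsets shown, segments starting at 0 and 10 are directly adjacent (gap 0) and segments starting at 0 and 13 are separated by three bases; the same pairing could be chained to assemble three component segments for a nine finger protein.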

The target sites identified by the above methods can be subject to further 
evaluation by other criteria or can be used directly for design or selection (if needed) and 
production of a zinc finger polypeptide specific for such a site. A further criterion for 
evaluating potential target sites is their proximity to particular regions within a gene. If a 
zinc finger polypeptide is to be used to repress a cellular gene on its own (i.e., without 
linking the zinc finger polypeptide to a repressing moiety), then the optimal location 
appears to be at, or within 50 bp upstream or downstream of, the site of transcription 
initiation, to interfere with the formation of the transcription complex (Kim & Pabo, J. 
Biol. Chem. 272:29795-29800 (1997)) or compete for an essential enhancer binding 
protein. If, however, a zinc finger polypeptide is fused to a functional domain such as the 
KRAB repressor domain or the VP16 activator domain, the location of the binding site is 
considerably more flexible and can be outside known regulatory regions. For example, a 
KRAB domain can repress transcription at a promoter up to at least 3 kbp from where 
KRAB is bound (Margolin et al., PNAS 91:4509-4513 (1994)). Thus, target sites can be 
selected that do not necessarily include or overlap segments of demonstrable biological 
significance within target genes, such as regulatory sequences. Other criteria for further 
evaluating target segments include the prior availability of zinc finger polypeptides 
binding to such segments or related segments, and/or ease of designing new zinc finger 
polypeptides to bind a given target segment. 

After a target segment has been selected, a zinc finger polypeptide that 
binds to the segment can be provided by a variety of approaches. The simplest of 
approaches is to provide a precharacterized zinc finger polypeptide from an existing 
collection that is already known to bind to the target site. However, in many instances, 
such zinc finger polypeptides do not exist. An alternative approach is to design new zinc 
finger polypeptides using information from a database of existing zinc 
finger polypeptides and their respective binding affinities. A further 
approach is to design a zinc finger polypeptide based on substitution rules as discussed 
above. A still further alternative is to select a zinc finger polypeptide with specificity for 
a given target by an empirical process such as phage display. In some such methods, each 
component finger of a zinc finger polypeptide is designed or selected independently of 
other component fingers. For example, each finger can be obtained from a different 
preexisting zinc finger polypeptide or each finger can be subject to separate 
randomization and selection. 

Once a zinc finger polypeptide has been selected, designed, or otherwise 
provided to a given target segment, the zinc finger polypeptide or the DNA encoding it 
is synthesized. Exemplary methods for synthesizing and expressing DNA encoding zinc 
finger proteins are described below. The zinc finger polypeptide or a polynucleotide 
encoding it can then be used for modulation of expression, or analysis of the target gene 
containing the target site to which the zinc finger polypeptide binds. 

IV. Expression and purification of zinc finger proteins made using the methods of 
the invention 

Chimeric zinc finger proteins comprising a flexible linker and nucleic 
acids encoding such chimeric zinc finger proteins can be made using routine techniques 
in the field of recombinant genetics. Basic texts disclosing the general methods of use in 
this invention include Sambrook et al., Molecular Cloning, A Laboratory Manual (2nd 
ed. 1989); Kriegler, Gene Transfer and Expression: A Laboratory Manual (1990); and 
Current Protocols in Molecular Biology (Ausubel et al., eds., 1994). In addition, 
essentially any nucleic acid can be custom ordered from any of a variety of commercial 
sources. Similarly, peptides and antibodies can be custom ordered from any of a variety 
of commercial sources. 

Two alternative methods are typically used to create the coding sequences 
required to express newly designed DNA-binding polypeptides and the flexible linker. 
One protocol is a PCR-based assembly procedure that utilizes six overlapping 
oligonucleotides (to make one three finger zinc finger polypeptide). Three 
oligonucleotides correspond to "universal" sequences that encode portions of the DNA- 
binding domain between the recognition helices. These oligonucleotides remain constant 
for all zinc finger constructs. The other three "specific" oligonucleotides are designed to 
encode the recognition helices. These oligonucleotides contain substitutions primarily at 
positions -1, 2, 3 and 6 on the recognition helices, making them specific for each of the 
different zinc fingers. 

To make a three finger zinc finger polypeptide, the PCR synthesis is 
carried out in two steps. First, a double stranded DNA template is created by combining 
the six oligonucleotides (three universal, three specific) in a four cycle PCR reaction with 
a low temperature annealing step, thereby annealing the oligonucleotides to form a DNA 
"scaffold." The gaps in the scaffold are filled in by a high-fidelity thermostable 
polymerase; the combination of Taq and Pfu polymerases also suffices. In the second 
phase of construction, the zinc finger template is amplified by external primers designed 
to incorporate restriction sites at either end for cloning into a shuttle vector or directly 
into an expression vector. 

An alternative method of cloning the newly designed DNA-binding 
proteins relies on annealing complementary oligonucleotides encoding the specific 
regions of the desired chimeric zinc finger protein. This particular application requires 
that the oligonucleotides be phosphorylated prior to the final ligation step. This is usually 
performed before setting up the annealing reactions, but kinasing can also occur post- 
annealing. In brief, the "universal" oligonucleotides encoding the constant regions of the 
proteins (oligos 1, 2 and 3 of above) are annealed with their complementary 
oligonucleotides. Additionally, the "specific" oligonucleotides encoding the finger 
recognition helices are annealed with their respective complementary oligonucleotides. 
These complementary oligos are designed to fill in the region which was previously filled 
in by polymerase in the protocol described above. The complementary oligos to the 
common oligos 1 and finger 3 are engineered to leave overhanging sequences specific for 
the restriction sites used in cloning into the vector of choice. The second assembly 
protocol differs from the initial protocol in the following aspects: the "scaffold" encoding 
the newly designed ZFP is composed entirely of synthetic DNA, thereby eliminating the 
polymerase fill-in step; additionally, the fragment to be cloned into the vector does not 
require amplification. Lastly, the design of leaving sequence-specific overhangs 
eliminates the need for restriction enzyme digests of the inserting fragment. 

The resulting fragment encoding the newly designed zinc finger 
polypeptide is ligated into an expression vector. The sequence encoding the flexible 
linker is provided by an oligonucleotide that is ligated into the expression vector between 
the two DNA binding domains. The second DNA binding domain can be made as 
described above, or can be cloned or obtained from an alternative source using methods 
well known in the art, e.g., PCR and the like. Expression vectors that are commonly 
utilized include, but are not limited to, a modified pMAL-c2 bacterial expression vector 
(New England BioLabs, "NEB") or a eukaryotic expression vector, pcDNA (Promega). 

The nucleic acid encoding the chimeric zinc finger protein of choice is 
typically cloned into intermediate vectors for transformation into prokaryotic or 
eukaryotic cells for replication and/or expression, e.g., for determination of Kd. 
Intermediate vectors are typically prokaryote vectors, e.g., plasmids, or shuttle vectors, or 
insect vectors, for storage or manipulation of the nucleic acid encoding zinc finger protein 
or production of protein. The nucleic acid encoding a zinc finger protein is also typically 
cloned into an expression vector, for administration to a plant cell, animal cell, preferably 
a mammalian cell or a human cell, fungal cell, bacterial cell, or protozoal cell. 

To obtain expression of a cloned gene or nucleic acid, a chimeric zinc 
finger protein is typically subcloned into an expression vector that contains a promoter to 
direct transcription. Suitable bacterial and eukaryotic promoters are well known in the art 
and described, e.g., in Sambrook et al., Molecular Cloning, A Laboratory Manual (2nd 
ed. 1989); Kriegler, Gene Transfer and Expression: A Laboratory Manual (1990); and 
Current Protocols in Molecular Biology (Ausubel et al., eds., 1994). Bacterial expression 
systems for expressing the zinc finger protein are available in, e.g., E. coli, Bacillus sp., 
and Salmonella (Palva et al., Gene 22:229-235 (1983)). Kits for such expression systems 
are commercially available. Eukaryotic expression systems for mammalian cells, yeast, 
and insect cells are well known in the art and are also commercially available. 

The promoter used to direct expression of a chimeric zinc finger protein 
nucleic acid depends on the particular application. For example, a strong constitutive 
promoter is typically used for expression and purification of zinc finger protein. In 
contrast, when a zinc finger protein is administered in vivo for gene regulation, either a 
constitutive or an inducible promoter is used, depending on the particular use of the zinc 
finger protein. The promoter typically can also include elements that are responsive to 
transactivation, e.g., hypoxia response elements, Gal4 response elements, lac repressor 
response element, and small molecule control systems such as tet-regulated systems and 
the RU-486 system (see, e.g., Gossen & Bujard, Proc. Natl. Acad. Sci. U.S.A. 89:5547 
(1992); Oligino et al., Gene Ther. 5:491-496 (1998); Wang et al., Gene Ther. 4:432-441 
(1997); Neering et al., Blood 88:1147-1155 (1996); and Rendahl et al., Nat. Biotechnol. 
16:757-761 (1998)). 

In addition to the promoter, the expression vector typically contains a 
transcription unit or expression cassette that contains all the additional elements required 
for the expression of the nucleic acid in host cells, either prokaryotic or eukaryotic. A 
typical expression cassette thus contains a promoter operably linked, e.g., to the nucleic 
acid sequence encoding the zinc finger protein, and signals required, e.g., for efficient 
polyadenylation of the transcript, transcriptional termination, ribosome binding sites, or 
translation termination. Additional elements of the cassette may include, e.g., enhancers, 
and heterologous spliced intronic signals. 

The particular expression vector used to transport the genetic information 
into the cell is selected with regard to the intended use of the zinc finger protein, e.g., 
expression in plants, animals, bacteria, fungus, protozoa, etc. Standard bacterial 
expression vectors include plasmids such as pBR322 based plasmids, pSKF, pET23D, 
and commercially available fusion expression systems such as GST and LacZ. A 
preferred fusion protein is the maltose binding protein, "MBP." Such fusion proteins are 
used for purification of the zinc finger protein. Epitope tags can also be added to 
recombinant proteins to provide convenient methods of isolation, for monitoring 
expression, and for monitoring cellular and subcellular localization, e.g., c-myc or FLAG. 

Expression vectors containing regulatory elements from eukaryotic viruses 
are often used in eukaryotic expression vectors, e.g., SV40 vectors, papilloma virus 
vectors, and vectors derived from Epstein-Barr virus. Other exemplary eukaryotic 
vectors include pMSG, pAV009/A+, pMTO10/A+, pMAMneo-5, baculovirus pDSVE, 
and any other vector allowing expression of proteins under the direction of the SV40 
early promoter, SV40 late promoter, metallothionein promoter, murine mammary tumor 
virus promoter, Rous sarcoma virus promoter, polyhedrin promoter, or other promoters 
shown effective for expression in eukaryotic cells. 

Some expression systems have markers for selection of stably transfected
cell lines such as thymidine kinase, hygromycin B phosphotransferase, and dihydrofolate

[...] the polyhedrin promoter or other strong baculovirus promoters.

The elements that are typically included in expression vectors also include
a replicon that functions in E. coli, a gene encoding antibiotic resistance to permit
selection of bacteria that harbor recombinant plasmids, and unique restriction sites in
nonessential regions of the plasmid to allow insertion of recombinant sequences.

Standard transfection methods are used to produce bacterial, mammalian,
yeast or insect cell lines that express large quantities of protein, which are then purified
using standard techniques (see, e.g., Colley et al., J. Biol. Chem. 264:17619-17622
(1989); Guide to Protein Purification, in Methods in Enzymology, vol. 182 (Deutscher,
ed., 1990)). Transformation of eukaryotic and prokaryotic cells is performed according
to standard techniques (see, e.g., Morrison, J. Bact. 122:349-351 (1977); Clark-Curtiss &
Curtiss, Methods in Enzymology 101:347-362 (Wu et al., eds., 1983)).

Any of the well known procedures for introducing foreign nucleotide
sequences into host cells may be used. These include the use of calcium phosphate
transfection, polybrene, protoplast fusion, electroporation, liposomes, microinjection,
naked DNA, plasmid vectors, viral vectors, both episomal and integrative, and any of the
other well known methods for introducing cloned genomic DNA, cDNA, synthetic DNA
or other foreign genetic material into a host cell (see, e.g., Sambrook et al., supra). It is
only necessary that the particular genetic engineering procedure used be capable of
successfully introducing at least one gene into the host cell capable of expressing the
protein of choice.

Any suitable method of protein purification known to those of skill in the
art can be used to purify the chimeric zinc finger proteins of the invention (see Ausubel,
supra, Sambrook, supra). In addition, any suitable host can be used, e.g., bacterial cells,
insect cells, yeast cells, mammalian cells, and the like.

In one embodiment, expression of the zinc finger protein fused to a
maltose binding protein (MBP-zinc finger protein) in bacterial strain JM109 allows for
straightforward purification through an amylose column (NEB). High expression levels
of the chimeric zinc finger protein can be obtained by induction with IPTG since the
MBP-zinc finger protein fusion in the pMal-c2 expression plasmid is under the control of
the IPTG inducible tac promoter (NEB). Bacteria containing the MBP-zinc finger protein
fusion plasmids are inoculated into 2xYT medium containing 10 µM ZnCl2, 0.02%
glucose, plus 50 µg/ml ampicillin and shaken at 37°C. At mid-exponential growth IPTG
is added to 0.3 mM and the cultures are allowed to shake. After 3 hours the bacteria are
harvested by centrifugation, disrupted by sonication, and then insoluble material is
removed by centrifugation. The MBP-zinc finger proteins are captured on an
amylose-bound resin, washed extensively with buffer containing 20 mM Tris-HCl (pH
7.5), 200 mM NaCl, 5 mM DTT and 50 µM ZnCl2, then eluted with maltose in
essentially the same buffer (purification is based on a standard protocol from NEB).
Purified proteins are quantitated and stored for biochemical analysis.

The biochemical properties of the purified proteins, e.g., Kd, can be
characterized by any suitable assay. In one embodiment, Kd is characterized via
electrophoretic mobility shift assays ("EMSA") (Buratowski & Chodosh, in Current
Protocols in Molecular Biology pp. 12.2.1-12.2.7 (Ausubel ed., 1996)).



V. Regulatory domains 

The chimeric zinc finger proteins made using the methods of the invention
can optionally be associated with regulatory domains for modulation of gene expression.
The chimeric zinc finger protein can be covalently or non-covalently associated with one
or more regulatory domains, alternatively two or more regulatory domains, with the two
or more domains being two copies of the same domain, or two different domains. The
regulatory domains can be covalently linked to the chimeric zinc finger protein, e.g., via
an amino acid linker, as part of a fusion protein. The zinc finger proteins can also be
associated with a regulatory domain via a non-covalent dimerization domain, e.g., a
leucine zipper, a STAT protein N terminal domain, or an FK506 binding protein (see,
e.g., O'Shea, Science 254:539 (1991); Barahmand-Pour et al., Curr. Top. Microbiol.
Immunol. 211:121-128 (1996); Klemm et al., Annu. Rev. Immunol. 16:569-592 (1998);
Ho et al., Nature 382:822-826 (1996); and Pomerantz et al., Biochem. 37:965 (1998)).
The regulatory domain can be associated with the chimeric zinc finger protein at any
suitable position, including the C- or N-terminus of the chimeric zinc finger protein.

Common regulatory domains for addition to the chimeric zinc finger
protein made using the methods of the invention include, e.g., heterologous DNA binding

[...] oncogene transcription factors (e.g., myc, jun, fos, myb, max, mad, rel, ets, bcl, myb, mos
family members, etc.); and chromatin associated proteins and their modifiers (e.g.,
kinases, acetylases and deacetylases).

Transcription factor polypeptides from which one can obtain a regulatory
domain include those that are involved in regulated and basal transcription. Such
polypeptides include transcription factors, their effector domains, coactivators, silencers,
nuclear hormone receptors (see, e.g., Goodrich et al., Cell 84:825-30 (1996) for a review
of proteins and nucleic acid elements involved in transcription; transcription factors in
general are reviewed in Barnes & Adcock, Clin. Exp. Allergy 25 Suppl. 2:46-9 (1995) and
Roeder, Methods Enzymol. 273:165-71 (1996)). Databases dedicated to transcription
factors are also known (see, e.g., Science 269:630 (1995)). Nuclear hormone receptor
transcription factors are described in, for example, Rosen et al., J. Med. Chem.
38:4855-74 (1995). The C/EBP family of transcription factors is reviewed in Wedel et al.,
Immunobiology 193:171-85 (1995). Coactivators and co-repressors that mediate
transcription regulation by nuclear hormone receptors are reviewed in, for example,
Meier, Eur. J. Endocrinol. 134(2):158-9 (1996); Kaiser et al., Trends Biochem. Sci.
21:342-5 (1996); and Utley et al., Nature 394:498-502 (1998). GATA transcription
factors, which are involved in regulation of hematopoiesis, are described in, for example,
Simon, Nat. Genet. 11:9-11 (1995); Weiss et al., Exp. Hematol. 23:99-107. TATA box
binding protein (TBP) and its associated TAF polypeptides (which include TAF30,
TAF55, TAF80, TAF110, TAF150, and TAF250) are described in Goodrich & Tjian,
Curr. Opin. Cell Biol. 6:403-9 (1994) and Hurley, Curr. Opin. Struct. Biol. 6:69-75
(1996). The STAT family of transcription factors is reviewed in, for example,
Barahmand-Pour et al., Curr. Top. Microbiol. Immunol. 211:121-8 (1996). Transcription
factors involved in disease are reviewed in Aso et al., J. Clin. Invest. 97:1561-9 (1996).

In one embodiment, the KRAB repression domain from the human KOX-1
protein is used as a transcriptional repressor (Thiesen et al., New Biologist 2:363-374
(1990); Margolin et al., Proc. Natl. Acad. Sci. U.S.A. 91:4509-4513 (1994); Pengue et al.,
Nucl. Acids Res. 22:2908-2914 (1994); Witzgall et al., Proc. Natl. Acad. Sci. U.S.A.
91:4514-4518 (1994)). In another embodiment, KAP-1, a KRAB co-repressor, is used
with KRAB (Friedman et al., Genes Dev. 10:2067-2078 (1996)). Alternatively, KAP-1
can be used alone with a zinc finger protein. Other preferred transcription factors and
transcription factor domains that act as transcriptional repressors include MAD (see, e.g.,
Sommer et al., J. Biol. Chem. 273:6632-6642 (1998); Gupta et al., Oncogene 16:1149-



WO 99/45132 PCT/US99/04441

1159 (1998); Queva et al., Oncogene 16:967-977 (1998); Larsson et al., Oncogene
15:737-748 (1997); Laherty et al., Cell 89:349-356 (1997); and Cultraro et al., Mol. Cell.
Biol. 17:2353-2359 (1997)); FKHR (forkhead in rhabdomyosarcoma gene; Ginsberg et al.,
Cancer Res. 15:3542-3546 (1998); Epstein et al., Mol. Cell. Biol. 18:4118-4130 (1998));
EGR-1 (early growth response gene product-1; Yan et al., Proc. Natl. Acad. Sci. U.S.A.
95:8298-8303 (1998); and Liu et al., Cancer Gene Ther. 5:3-28 (1998)); the ets2
repressor factor repressor domain (ERD; Sgouras et al., EMBO J. 14:4781-4793
(1995)); and the MAD mSIN3 interaction domain (SID; Ayer et al., Mol. Cell. Biol.
16:5772-5781 (1996)).

In one embodiment, the HSV VP16 activation domain is used as a
transcriptional activator (see, e.g., Hagmann et al., J. Virol. 71:5952-5962 (1997)). Other
preferred transcription factors that could supply activation domains include the VP64
activation domain (Seipel et al., EMBO J. 11:4961-4968 (1996)); nuclear hormone
receptors (see, e.g., Torchia et al., Curr. Opin. Cell. Biol. 10:373-383 (1998)); the p65
subunit of nuclear factor kappa B (Bitko & Barik, J. Virol. 72:5610-5618 (1998) and
Doyle & Hunt, Neuroreport 8:2937-2942 (1997)); and EGR-1 (early growth response
gene product-1; Yan et al., Proc. Natl. Acad. Sci. U.S.A. 95:8298-8303 (1998); and Liu et
al., Cancer Gene Ther. 5:3-28 (1998)).

Kinases, phosphatases, and other proteins that modify polypeptides
involved in gene regulation are also useful as regulatory domains for chimeric zinc finger
proteins. Such modifiers are often involved in switching on or off transcription mediated
by, for example, hormones. Kinases involved in transcription regulation are reviewed in
Davis, Mol. Reprod. Dev. 42:459-67 (1995), Jackson et al., Adv. Second Messenger
Phosphoprotein Res. 28:279-86 (1993), and Boulikas, Crit. Rev. Eukaryot. Gene Expr.
5:1-77 (1995), while phosphatases are reviewed in, for example, Schonthal, Semin.
Cancer Biol. 6:239-48 (1995). Nuclear tyrosine kinases are described in Wang, Trends
Biochem. Sci. 19:373-6 (1994).

As described, useful domains can also be obtained from the gene products 
of oncogenes (e.g., myc, jun, fos, myb, max, mad, rel, ets, bcl, myb, mos family 






[...] Jones and Bartlett Publishers, 1995. The ets transcription factors are reviewed in
Wasylyk et al., Eur. J. Biochem. 211:7-18 (1993) and Crepieux et al., Crit. Rev. Oncog.
5:615-38 (1994). Myc oncogenes are reviewed in, for example, Ryan et al., Biochem. J.
214:713-21 (1996). The jun and fos transcription factors are described in, for example,
The Fos and Jun Families of Transcription Factors, Angel & Herrlich, eds. (1994). The
max oncogene is reviewed in Hurlin et al., Cold Spring Harb. Symp. Quant. Biol.
59:109-16. The myb gene family is reviewed in Kanei-Ishii et al., Curr. Top. Microbiol.
Immunol. 211:89-98 (1996). The mos family is reviewed in Yew et al., Curr. Opin.
Genet. Dev. 3:19-25 (1993).

In another embodiment, histone acetyltransferase is used as a
transcriptional activator (see, e.g., Jin & Scotto, Mol. Cell. Biol. 18:4377-4384 (1998);
Wolffe, Science 272:371-372 (1996); Taunton et al., Science 272:408-411 (1996); and
Hassig et al., Proc. Natl. Acad. Sci. U.S.A. 95:3519-3524 (1998)). In another
embodiment, histone deacetylase is used as a transcriptional repressor (see, e.g., Jin &
Scotto, Mol. Cell. Biol. 18:4377-4384 (1998); Syntichaki & Thireos, J. Biol. Chem.
273:24414-24419 (1998); Sakaguchi et al., Genes Dev. 12:2831-2841 (1998); and
Martinez et al., J. Biol. Chem. 273:23781-23785 (1998)).

In addition to regulatory domains, the chimeric zinc finger protein is often
expressed as a fusion with a protein or tag such as maltose binding protein ("MBP"),
glutathione S transferase (GST), hexahistidine, c-myc, or the FLAG epitope, for ease of
purification, monitoring expression, or monitoring cellular and subcellular localization.

All publications and patent applications cited in this specification are
herein incorporated by reference as if each individual publication or patent application
were specifically and individually indicated to be incorporated by reference.

Although the foregoing invention has been described in some detail by
way of illustration and example for purposes of clarity of understanding, it will be readily
apparent to one of ordinary skill in the art in light of the teachings of this invention that
certain changes and modifications may be made thereto without departing from the spirit
or scope of the appended claims.



EXAMPLES 

The following examples are provided by way of illustration only and not
by way of limitation. Those of skill in the art will readily recognize a variety of
noncritical parameters that could be changed or modified to yield essentially similar
results.

METHODS 

Plasmid construction. Zinc finger expression plasmids used in transfection
studies were constructed by PCR amplification of DNA segments encoding the desired
fingers of the Zif268 peptide and/or the NRE peptide. These DNA segments were
inserted into the HindIII and BamHI sites of pCS, which had been constructed by
subcloning an oligonucleotide duplex
(5'-AGCTACCATGGCCAAGGAAACCGCAGCTGCCAAAT
TCGAAAGACAGCATATGGATTCTAAGCTTCGCGGATCCT-3' (SEQ ID NO: 1);
5'-CTAGAGGATCCGCGAAGCTTAGAATCCATATGCTGTCT
TTCGAATTTGGCAGCTGCGGTTTCCTTGGCCATGGT-3' (SEQ ID NO: 2)) into the
HindIII and XbaI sites of pcDNA3 (Invitrogen). These expression plasmids were
designed to produce zinc finger peptides with both an S-peptide tag (Kim & Raines,
Protein Sci. 2:348-356 (1993); Kim & Raines, Anal. Biochem. 219:165-166 (1995)) and a
nuclear localization signal from SV40 large T-antigen (Kalderon et al., Cell 39:499-509
(1984)) at their N-terminus. Reporter plasmids were constructed by site-directed
mutagenesis using the QuikChange™ kit (Stratagene). Construction of the template
plasmid (pGL3-TATA/Inr) for the mutagenesis was described previously (Kim & Pabo,
J. Biol. Chem. 272:29795-29800 (1997)). The DNA sequences of all constructs were
confirmed by dideoxy sequencing.
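
The two oligonucleotides above can be checked for internal consistency with a short script. The following is only an illustrative sketch in Python, assuming the sequences are exactly as transcribed in SEQ ID NO: 1 and SEQ ID NO: 2; the helper name `reverse_complement` and all variable names are hypothetical and are not part of the original protocol. It tests whether the two strands anneal to leave 5' overhangs compatible with HindIII-cut (AGCT) and XbaI-cut (CTAG) ends, and whether the insert carries the internal HindIII and BamHI sites later used to receive the zinc finger segments.

```python
def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq))

# Sequences as transcribed from SEQ ID NO: 1 and SEQ ID NO: 2 (5' to 3').
top = ("AGCTACCATGGCCAAGGAAACCGCAGCTGCCAAAT"
       "TCGAAAGACAGCATATGGATTCTAAGCTTCGCGGATCCT")
bottom = ("CTAGAGGATCCGCGAAGCTTAGAATCCATATGCTGTCT"
          "TTCGAATTTGGCAGCTGCGGTTTCCTTGGCCATGGT")

# Assumption: the first four bases of each strand form the 5' overhang.
top_overhang, top_core = top[:4], top[4:]
bottom_overhang, bottom_core = bottom[:4], bottom[4:]

# The duplex anneals cleanly if the two cores are reverse complements.
duplex_ok = reverse_complement(bottom_core) == top_core

# Internal cloning sites carried by the insert.
has_hindiii = "AAGCTT" in top
has_bamhi = "GGATCC" in top

print("cores complementary:", duplex_ok)
print("overhangs:", top_overhang, bottom_overhang)
print("internal HindIII / BamHI sites:", has_hindiii, has_bamhi)
```

A check of this kind only confirms that the transcribed strands are mutually consistent; it says nothing about the reading frame or the encoded tag, which would need a separate translation step.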

Protein production and purification. The DNA segments encoding the
Zif268, NRE, and 268//NRE peptides were amplified by PCR and subcloned into
pGEX-6P-3 (Pharmacia). The zinc finger proteins were expressed in E. coli as fusions with
glutathione S-transferase (GST) and were purified using affinity chromatography
according to the manufacturer's protocol. These constructs did not have an S-peptide tag
or an SV40 nuclear localization signal. GST was subsequently removed by digestion
with PreScission™ Protease (Pharmacia). Protein concentrations were estimated by
using SDS-polyacrylamide gel electrophoresis with bovine serum albumin as a standard

[...] These two methods gave comparable results, indicating that almost all of the
protein was active.

Gel shift assay. DNA binding reactions contained the appropriate zinc
finger peptide and binding site(s) in a solution of 20 mM bis-Tris propane pH 7.0, 100
mM NaCl, 5 mM MgCl2, 20 µM ZnSO4, 10% glycerol, 0.1% Nonidet P40, 5 mM DTT,
and 0.10 mg/mL bovine serum albumin in a total volume of 10 µL. All binding
experiments were performed at room temperature. The DNA sequences of the binding
sites follow: N site, 5'-TCTGC AAGGGTTCA GGCGACACCAACCAA-3' (SEQ ID
NO: 3); Z site, 5'-GTGTGTGTGTGATCT GCGTGGGCG GTAAG-3' (SEQ ID NO: 4);
NZ site, 5'-TCTGC AAGGGTTCA GCGTGGGCG GTAAG-3' (SEQ ID NO: 5); N/Z
site, 5'-TCTGC AAGGGTTCA G GCGTGGGCG GTAAG-3' (SEQ ID NO: 6); and N//Z
site, 5'-TCTGC AAGGGTTCA GT GCGTGGGCG GTAAG-3' (SEQ ID NO: 7). In each
case, the 9-bp recognition sequences are underlined. Labeled DNAs used in gel shift
assays were prepared by Klenow extension or kinase reaction.

To determine dissociation constants, 3-fold serial dilutions of the Zif268 or
NRE peptide were incubated with a labeled probe DNA (0.4-1.4 pM) at room temperature
for 1 h, and then the reaction mixtures were subjected to gel electrophoresis. The
radioactive signals were quantitated by phosphorimager analysis; apparent dissociation
constants were determined as described (Rebar & Pabo, Science 263:671-673 (1994)).

On-rates and off-rates were also determined by gel shift assay. To initiate
the binding reaction when determining on-rate constants, a labeled probe DNA (final
concentration, ~0.4 pM) was added to the zinc finger peptide (final concentration, 5-10
pM) at room temperature, and aliquots were analyzed by gel electrophoresis at various
time points (0-20 min). The fraction bound at time t was determined by phosphorimager
analysis of the gels. The data were then fit (KaleidaGraph™ program (Synergy
Software)) to the equation:

F = Ffinal × [1 - exp(-kobs × t)]

where F is the fraction bound at time t; Ffinal is the calculated fraction
bound at the completion of the reaction; and kobs is the rate constant (Hoopes et al., J.
Biol. Chem. 267:11539-11547 (1992)). The on-rate constant was calculated from the
equation:

kon = (Ffinal × kobs)/[P]

where [P] is the concentration of the zinc finger protein. Off-rate
constants were determined essentially as described (Kim et al., Proc. Natl. Acad. Sci.
USA 94:3616-3620 (1997)). Proteins (final concentration, 100 pM) were preincubated
with a labeled probe DNA for 1 hour and then a large excess of unlabeled probe DNA
(final concentration, 20 nM) was added. Aliquots were removed at various time points
and analyzed by gel electrophoresis. The fraction of labeled site was normalized to the
fraction found at the end of the 1 hour preincubation period. The natural log of the
normalized fraction bound was plotted against time, and the off-rate was determined from
the slope. All data points for fast on-rate and off-rate measurements were corrected for
the electrophoresis dead time.
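
The two rate calculations described above can be sketched in a few lines of Python. This is an illustration only, not the KaleidaGraph procedure actually used: the function names are hypothetical, the on-rate function simply evaluates the stated relation kon = (Ffinal × kobs)/[P] from already-fitted values, and the off-rate function is an ordinary least-squares fit of the natural log of the normalized fraction bound against time. All input numbers are invented for demonstration and are not data from these experiments.

```python
import math

def on_rate_constant(f_final, k_obs, protein_conc_molar):
    """k_on = (F_final * k_obs) / [P], in M^-1 s^-1 when k_obs is in s^-1."""
    return f_final * k_obs / protein_conc_molar

def off_rate_constant(times_s, fraction_bound):
    """Estimate k_off (s^-1) as minus the least-squares slope of
    ln(fraction bound / fraction bound at the first time point) versus time."""
    f_start = fraction_bound[0]  # fraction bound at the end of preincubation
    log_norm = [math.log(f / f_start) for f in fraction_bound]
    n = len(times_s)
    mean_t = sum(times_s) / n
    mean_y = sum(log_norm) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_s, log_norm))
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    return -sxy / sxx

# Hypothetical fitted values for an association time course.
k_on = on_rate_constant(f_final=0.9, k_obs=2.0e-3, protein_conc_molar=7.5e-12)

# Hypothetical dissociation time course generated from a known rate constant.
times = [0.0, 60.0, 120.0, 240.0]
true_k_off = 5.0e-3
fractions = [0.8 * math.exp(-true_k_off * t) for t in times]
k_off = off_rate_constant(times, fractions)

print(f"k_on  = {k_on:.2e} M^-1 s^-1")
print(f"k_off = {k_off:.2e} s^-1")
```

With real gel data the log-linear fit would be applied only to time points where the bound fraction is well above background, and the dead-time correction mentioned above would be applied to the time axis first.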

Competition binding studies. The 268//NRE peptide (final concentration,
5 pM) was first incubated for 1 hour with various amounts of a cold competitor DNA (0,
0.05, 0.5, 5, and 50 nM), and then the labeled N/Z site (6-3 pM) was added. Samples
were analyzed by gel electrophoresis after 2, 24, 48, 96, 190, and 600 hours. Specificity
ratios (Kdc/Kd) were calculated from the equation:

Kdc/Kd = {[C]/[P]t} × (F0 × F)/[(F0 - F)(1 - F)]

where Kdc is the dissociation constant for binding to the competitor DNA;
Kd is the dissociation constant for binding to the intact chimeric site; [C] is the
concentration of competitor DNA; [P]t is the total concentration of the protein; F0 is the
fraction bound in the absence of the competitor DNA; and F is the fraction bound in the
presence of the competitor DNA. This equation assumes that the concentration of free
protein is significantly smaller than that of protein bound to DNA. This criterion should
readily be satisfied since the Kd of the 268//NRE peptide at the N/Z site is 3.8 fM, and 5
pM of the fusion peptide was used in these competition experiments.
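
The specificity-ratio equation above is straightforward to evaluate numerically. The following Python sketch is illustrative only: the function name and the example inputs are hypothetical (they are not measured values from these experiments), and the function simply evaluates the stated expression under its stated assumption that nearly all protein is DNA-bound.

```python
def specificity_ratio(competitor_conc, protein_total, f0, f):
    """Evaluate Kdc/Kd = ([C]/[P]t) * (F0 * F) / ((F0 - F) * (1 - F)).

    competitor_conc and protein_total must be in the same concentration units.
    f0 is the fraction bound without competitor; f is the fraction bound with it.
    """
    if not (0.0 < f < f0 < 1.0):
        raise ValueError("expected 0 < F < F0 < 1")
    return (competitor_conc / protein_total) * (f0 * f) / ((f0 - f) * (1.0 - f))

# Hypothetical example: 50 nM competitor, 5 pM protein,
# 70% shifted without competitor and 35% shifted with it.
ratio = specificity_ratio(50e-9, 5e-12, f0=0.70, f=0.35)
print(f"Kdc/Kd is approximately {ratio:.0f}")
```

The expression diverges as F approaches F0, so a measurement in which the competitor has no detectable effect yields only a lower bound on the ratio.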

Competition experiments with salmon sperm DNA contained the
268//NRE or Zif268 peptide (200 pM), the labeled N/Z site, and a slight molar excess of
unlabeled N/Z site. Various amounts of salmon sperm DNA were added, and samples
were analyzed by gel electrophoresis after 2, 24, and 48 hours incubation. When
calculating specificity ratios, it was assumed that each base in the salmon sperm DNA
represents the beginning of a potential (nonspecific) binding site.

Transient cotransfection assay. The 293 cells were transfected by calcium

[...] confluency in monolayer cultures (6-well plates), and the following plasmids were added:
0.2 µg of the empty expression plasmid (pCS) or of expression plasmids encoding zinc
finger peptides; 0.2 µg of a reporter plasmid; 1 µg of activator plasmid (GAL4-VP16);
0.1 µg of β-galactosidase expression plasmid (pCMVβ; Clontech); and 2.5 µg of carrier
plasmid (pUC19). The luciferase and β-galactosidase activities in the transfected cells
were measured as described (Kim et al., Proc. Natl. Acad. Sci. USA 94:3616-3620
(1997); Kim & Pabo, J. Biol. Chem. 272:29795-29800 (1997)). All the zinc finger
peptides expressed in 293 cells were quantitated by using the S.Tag™ Rapid Assay kit
(Novagen) (Kim & Raines, Protein Sci. 2:348-356 (1993); Kim & Raines, Anal. Biochem.
219:165-166 (1995)).



RESULTS 

Structure-based design of poly-zinc finger peptides. The design strategy
involved linking two three-finger peptides, using longer (noncanonical) linkers at the
junction to avoid introducing any strain. To further reduce any risk of interference or
collision between the fingers, the linkers were designed so they could accommodate
composite binding sites with one or two additional base pairs inserted between the
individual 9-bp binding sites. Studies reported in this paper used the three-finger Zif268
peptide (which recognizes the site 5'-GCG TGG GCG-3'; SEQ ID NO: 8) and a three-finger
"NRE" peptide (a Zif268 variant previously selected via phage display) that binds
tightly and specifically to part of a nuclear hormone response element (5'-AAG GGT
TCA-3'; SEQ ID NO: 9) (Greisman & Pabo, Science 275:657-661 (1997)). The
composite target site with one additional base pair at the center has the sequence 5'-AAG
GGT TCA G GCG TGG GCG-3' (SEQ ID NO: 10) and is called the N/Z site (N denotes
the binding site for the NRE peptide and Z the binding site for Zif268). The site with two
additional base pairs at the center has the sequence 5'-AAG GGT TCA GT GCG TGG
GCG-3' (SEQ ID NO: 11) and is called the N//Z site.
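
The relationship between the individual 9-bp subsites and the composite sites can be made explicit with a small sketch. The Python below is illustrative only; the function name `composite_site` and the dictionary layout are hypothetical conveniences, while the subsite and spacer sequences are taken from the text (the directly juxtaposed NZ arrangement is the one listed in the Methods).

```python
# 9-bp subsites as given in the text (5' to 3').
N_SITE = "AAGGGTTCA"   # bound by the NRE peptide
Z_SITE = "GCGTGGGCG"   # bound by the Zif268 peptide

def composite_site(spacer=""):
    """Join the N and Z subsites with an optional spacer between them."""
    return N_SITE + spacer + Z_SITE

sites = {
    "NZ": composite_site(),        # subsites directly juxtaposed
    "N/Z": composite_site("G"),    # one additional base pair
    "N//Z": composite_site("GT"),  # two additional base pairs
}

for name, seq in sites.items():
    print(f"{name:5s} {len(seq):2d} bp  5'-{seq}-3'")
```

Writing the sites this way makes the notation easy to read: each slash in the site name corresponds to one inserted base pair between the two 9-bp subsites.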

Structure-based design, with the Zif268 complex (Pavletich & Pabo,
Science 252:809-817 (1991); Elrod-Erickson et al., Structure 4:1171-1180 (1996)) as a
model, was used to determine the appropriate length of linkers for making poly-finger
proteins that could recognize each binding site (see Figures 1 and 2). At the N/Z site, it
appeared that having 8 residues between the Leu at the α-helical end of the first peptide
and the Tyr residue at the first β-sheet of the next peptide would allow sufficient
flexibility. A canonical "TGEKP" linker has 4 residues (i.e., Gly-Glu-Lys-Pro) in this
region. At the N//Z site, it seemed reasonable to use 11 residues between the Leu and the
Tyr (Fig. 1A). Each linker (Fig. 1A) contained sequences that naturally flank the
N-terminus and C-terminus of the three-finger Zif268 peptide. To allow additional
flexibility, a glycine was included in the shorter linker (which still is 4 residues longer
than a canonical linker), and a Gly-Gly-Gly-Ser sequence was included in the longer
linker (which is 7 residues longer than a canonical linker). Using a notation analogous to
that for the binding sites, the fusion protein with the shorter linker is denoted as 268/NRE
and the fusion protein with the longer linker is denoted as 268//NRE.

Gel shift assays to determine dissociation constants and half-lives of
protein-DNA complexes. The Zif268, NRE, and 268//NRE zinc finger peptides were
expressed and purified from E. coli, and used in several sets of gel shift experiments. A
preliminary set of experiments was simply designed to determine whether two three-finger
proteins could bind at adjacent 9-bp sites (any interference in binding of the
unlinked peptides could reduce the affinity of a poly-finger protein for the composite
sites). The first experiments used a DNA fragment (referred to as the NZ site) with the
NRE- and Zif268-binding sites directly juxtaposed (5'-AAG GGT TCA GCG TGG
GCG-3'; SEQ ID NO: 12). Various amounts of the NRE peptide were incubated with
labeled NZ site in the presence or absence of Zif268 (Figure 3). It was determined that
the three-finger NRE peptide actually binds slightly more tightly to the NZ site with
prebound Zif268 than to the free site. The apparent dissociation constant (Kd) of the NRE
peptide is 180 pM when it binds alone but 60 pM when Zif268 is prebound to the
neighboring site. Similar results were obtained at the N/Z site. These experiments prove
that there is no collision between peptides bound at adjacent sites and suggest that there
may even be some modest cooperative effect. It appears that previous limits in the
affinity of poly-finger proteins (Rebar (Ph.D. Thesis), Selection Studies of Zinc Finger-
DNA Recognition, Massachusetts Institute of Technology (1997); Shi (Ph.D. Thesis),
Molecular Mechanisms of Zinc Finger Protein-Nucleic Acid Interactions, Johns Hopkins
University (1995); Liu et al., Proc. Natl. Acad. Sci. USA 94:5525-5530 (1997)) were due
to problems with linker design.

[...] affinity for the composite sites than for the individual 9-bp sites (Table 1). The fusion
protein binds to the isolated 9-bp sites with Kds similar to those of the NRE peptide (180
pM) and the Zif268 peptide (14 pM) for their binding sites. In contrast, the 268//NRE
fusion protein binds composite sites so tightly that dissociation constants are too small to
readily be determined by protein titration. At least 0.4 pM of labeled probe DNA was
needed in these gel shift experiments, making it difficult to accurately determine Kd
values of < 1 pM. Given these technical difficulties, it was decided to measure the
on-rate and off-rate for binding of the 268//NRE peptide and to use these rates to estimate the
equilibrium binding constant (Table 1). Parallel studies with the three-finger peptides
provided useful controls. On rates for the 268//NRE, NRE, and Zif268 peptides were fast
and were close to the diffusion-controlled limit (10^8 to 10^9 M^-1 s^-1) (von Hippel & Berg,
J. Biol. Chem. 264:675-678 (1989)). The off rates showed amazing differences: The
three-finger peptides have half-lives of < 39 seconds, whereas the 268//NRE peptide has a
half-life of 370 hours at the NZ site. Control studies show that the 268//NRE peptide forms
a much less stable complex with a single 9-bp site (thus the half-life = 150 seconds at the
N site). Both the NRE fingers and the Zif268 fingers must bind their respective 9-bp
subsites to form the extraordinarily stable complex observed with the 268//NRE peptide
at the NZ site.
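
The link between the measured rates, the quoted half-life, and the estimated equilibrium constant can be illustrated with a short calculation. The Python sketch below is an illustration only: it assumes simple first-order dissociation (half-life = ln 2 / koff) and Kd = koff/kon, uses the NZ-site rate constants reported for the 268//NRE peptide in Table 1 (kon of about 2.5 × 10^8 M^-1 s^-1 and koff of about 5.2 × 10^-7 s^-1), and compares the result with the three-finger Kd values quoted in the text. Variable names are hypothetical.

```python
import math

# Rate constants for 268//NRE at the NZ site, as reported in Table 1.
k_on = 2.5e8     # M^-1 s^-1
k_off = 5.2e-7   # s^-1

# First-order dissociation: half-life = ln(2) / k_off.
half_life_hours = math.log(2) / k_off / 3600.0

# Equilibrium dissociation constant from the ratio of rate constants.
kd_molar = k_off / k_on
kd_femtomolar = kd_molar * 1e15

# Three-finger Kd values quoted in the text, for comparison.
kd_nre = 180e-12     # M
kd_zif268 = 14e-12   # M
fold_vs_nre = kd_nre / kd_molar
fold_vs_zif268 = kd_zif268 / kd_molar

print(f"half-life is about {half_life_hours:.0f} hours")
print(f"Kd is about {kd_femtomolar:.1f} fM")
print(f"fold tighter than NRE: {fold_vs_nre:.0f}; than Zif268: {fold_vs_zif268:.0f}")
```

Under these assumptions the numbers are mutually consistent: the off-rate corresponds to a half-life of roughly 370 hours, and the ratio of the rates gives a Kd close to 2 fM, several thousand-fold to tens of thousands-fold tighter than either three-finger peptide.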

In all cases where parallel measurements could be performed, Kd values
calculated from the ratio of kinetic constants (koff/kon) were in good agreement with those
determined from equilibrium studies (Table 1). This gave confidence in using the kinetic
data to determine Kds in cases where direct titration was impracticable. Calculations
show that the 268//NRE peptide has femtomolar affinity for the composite binding sites,
with a Kd of 2.1 × 10^-15 M (2.1 fM) at the NZ site, 3.7 fM at the N/Z site, and 3.0 fM at
the N//Z site (the consistency of these three Kds also is encouraging since it would be
expected that the longer, flexible linker should readily accommodate any of these
spacings). The data show that the new linker design is quite effective: the 268//NRE
fusion peptide binds far more tightly (5,000-95,000 fold) to the composite site than to the
individual 9-bp sites, and it binds far more tightly (6,000-90,000 fold) than either of the
original three-finger peptides.

Table 1

Dissociation Constants and Rate Data

Protein     Binding site   Kd, pM      kon, M^-1 s^-1       koff, s^-1
268//NRE    N              190 ± 50    2.5 ± 0.4 × 10^7     4.7 ± 2.9 × 10^-3
268//NRE    Z              10*
268//NRE    NZ             <1.0†       2.5 ± 0.2 × 10^8     5.2 ± 0.9 × 10^-7
268//NRE    N/Z            <1.0†       2.5 ± 0.2 × 10^8     9.2 ± 0.7 × 10^-7
268//NRE    N//Z           <1.0†       2.6 ± 0.6 × 10^8     7.7 ± 1.3 × 10^-7
NRE         N/Z            180 ± 43    >7.3 × 10^7          >5.9 × 10^-2
Zif268      NZ             12 ± 3
Zif268      N/Z            14 ± 4      >7.0 × 10^8          1.4 ± 0.4 × 10^-2
Zif268      N//Z           14 ± 1

All the constants were determined in at least two separate experiments, and the SEM is
indicated.

* An exact value could not be determined because this complex gave a smeared band
on the gels.

† [...] site.

Competition experiments were also used to further study the affinity and
specificity of the six-finger 268//NRE peptide (Figure 4A). One set of experiments
directly tested how well the 9-bp N and Z sites could compete with the composite N/Z
site for binding to the fusion peptide. In these experiments, various amounts of cold N or
Z site were mixed with a limiting amount of the 268//NRE peptide. After 1 hour of
incubation, a slight molar excess (relative to the total amount of fusion protein) of
labeled N/Z site was added. Under these conditions, about 70% of the labeled DNA is
shifted in the absence of competitor DNA. Samples taken at various time points were
analyzed by gel electrophoresis. Since the 268//NRE peptide concentration in this
experiment (5 pM) is a few orders of magnitude higher than the peptide's dissociation
constant for the N/Z site, almost all the peptide binds to the N/Z site when no competitor
DNA is added. Any decrease in the amount of shifted N/Z site in the presence of
competitor DNA reflects binding of the 268//NRE peptide to the competing site.

Equilibration in these experiments requires hundreds of hours, and the
stability of the purified protein actually becomes a significant concern (the composite site
is added last, and equilibration takes a long time since the fusion protein may encounter
cold Z sites hundreds or thousands of times before it first encounters a labeled N/Z site).
After pre-equilibration with high concentrations of cold N or Z site, it was determined
that the fraction of N/Z label shifted increases steadily with increasing incubation times of
up to about 600 hours. After 600 hours of incubation, a significant fraction of the labeled
N/Z site is shifted even in the presence of a 10,000-fold molar excess of cold N or Z site.
Specificity ratios (calculated as described above) indicate that the 268//NRE peptide
prefers the composite site over the N site by a factor of at least 3,800 ± 1,600 and that the
fusion peptide prefers the composite site over the Z site by a factor of at least 320 ± 44.
These experiments directly confirm the remarkable specificity of the six-finger peptide,
but these values are only lower bounds on the specificity ratios. The protein sample loses
some activity during the long incubation time required by these experiments (the activity
of the free protein has a half-life of about 2 days under these conditions), and denatured
protein will never have a chance to shift the labeled N/Z site.

Competition experiments with salmon sperm DNA were used to estimate
the ratio of specific/nonspecific binding constants for the 268//NRE peptide (Figure 4B).
These experiments showed that the 268//NRE peptide discriminates very effectively
against nonspecific DNA and indicate a specificity ratio (nonspecific Kd/specific Kd) of
8.8 ± 1.5 × 10^6. Parallel experiments with the three-finger Zif268 peptide give a
specificity ratio of 1.2 ± 0.1 × 10^5. Previous studies, using calf thymus DNA as a
competitor and slightly different conditions, had given a specificity ratio of 0.31 × 10^5
for the Zif268 peptide (Greisman & Pabo, Science 275:657-661 (1997)). Taken together,
data on the affinity and specificity of the six-finger 268//NRE fusion peptide suggested
that it might serve as a very effective repressor and certainly indicated that it would be
an excellent candidate for further analysis in vivo.

Transient cotransfection studies in the 293 human cell line were used to
see whether the new poly-finger peptides could effectively repress transcription from
reporter genes. In a previous study, it had been shown that the Zif268 peptide could
efficiently repress both basal and VP16-activated transcription when the Zif268 peptide
bound to a site near the TATA box or the initiator element (Kim & Pabo, J. Biol. Chem.
272:29795-29800 (1997)). In this current study, a luciferase reporter and similar
promoter constructs were used in which appropriate binding sites (Z, N, N/Z, and N//Z)
were incorporated at comparable positions near the initiator element (Fig. 1B).

It was determined that the 268//NRE peptide gives 72-fold repression of
VP16-activated transcription at a promoter containing the N/Z site and 47-fold repression
at a promoter containing the N//Z site (Figs. 5A-5D). The 268/NRE peptide gives 68-fold
repression at the N/Z site. Clearly, these fusion peptides are very effective repressors at
sites with the appropriate spacings. Parallel experiments with the three-finger peptides
show repression but indicate that they are considerably less effective than the fusion
peptides. Thus the NRE peptide gives 1.9-fold repression with an N site in the promoter;
1.8-fold repression with an N/Z site; 2.7-fold repression with an N//Z site; and no
repression with an isolated Z site. The Zif268 peptide gives 13-fold repression from the Z
promoter; 8.9-fold repression from the N/Z promoter; 15-fold repression from the N//Z
promoter; and no repression with an isolated N site. Further experiments prove that
covalent coupling is needed to achieve the much higher repression levels obtained with
the fusion proteins at the N/Z site.

Thus co-expressing the Zif268 and NRE peptides as separate polypeptides [illegible]
gives a repression level comparable (within experimental error) to the
8.9-fold repression obtained at this site with the isolated Zif268 peptide. This is far less
than the 68-fold and 72-fold repression that the 268/NRE and 268//NRE fusion proteins
give at the N/Z site, and it is clear that these "synergistic" effects require covalent linkage.

It is noted that the additional fingers in the fusion peptides may have some
modest repressive effects even in cases where only three of the fingers can bind
specifically. Thus the six-finger peptides (268/NRE and 268//NRE) give 21- to 23-fold
repression from the Z promoter. A similar (22-fold) repression level is obtained with the
268/NRE peptide at the N//Z site. Modeling suggests that the linker is too short to allow
specific binding of all six fingers at this site. These repression levels are consistently
somewhat higher than the level observed with the isolated Zif268 peptide at the Z site
(13-fold repression). It seems possible (when the 268//NRE peptide binds to the Z site)
that 1) the NRE fingers are free and yet sterically interfere with assembly of the
transcription complex or that 2) the NRE fingers make weak nonspecific contacts with the
DNA and thus slightly enhance the stability of the complex. Further studies indicate that
all peptides are expressed at comparable levels.

The zinc finger peptides expressed in 293 cells had an S-peptide tag, and
the amount of peptide was quantitated by using a ribonuclease assay after activating with
S-protein (Kim & Raines, Protein Sci. 2:348-356 (1993); Kim & Raines, Anal. Biochem.
219:165-166 (1995)). A conservative estimate indicates that the expression levels of the
peptides in cells are significantly higher (at least 100 fold) than the dissociation constants
of the three-finger peptides. Plasmids that would encode four- and five-finger variants of
the 268/NRE and 268//NRE peptides were also constructed. These were tested in tissue
culture transfection studies, and they typically gave repression levels intermediate
between those obtained with the three-finger peptides and those obtained with the six-
finger peptides (Figs. 5A-5D).

WHAT IS CLAIMED IS:

1. A method of making a chimeric zinc finger protein that binds to
adjacent target sites, the method comprising the steps of:

(i) selecting a first and a second DNA-binding domain polypeptide
of the chimeric zinc finger protein, wherein at least one of the domains comprises a zinc
finger polypeptide, and wherein the first domain binds to a first target site and the second
domain binds to a second target site, which target sites are adjacent;

(ii) using structure-based design to determine the physical
separation between the first and second domains when they are individually bound to the
first and second target sites;

(iii) selecting a flexible linker that is at least 1-2 Å longer than the
physical separation between the first and second domains; and

(iv) fusing the first and second domains with the flexible linker,
thereby making a chimeric zinc finger protein that binds to adjacent target sites.

2. The method of claim 1, wherein the flexible linker is at least five
amino acids in length.

3. The method of claim 1, wherein the flexible linker is at least 8
amino acids in length.

4. The method of claim 3, wherein the flexible linker has the amino
acid sequence RQKDGERP.

5. The method of claim 1, wherein the flexible linker is at least 11
amino acids in length.

6. The method of claim 5, wherein the flexible linker has the amino
acid sequence RQKDGGGSERP.

7. The method of claim 1, wherein the adjacent target sites are
[illegible]

8. The method of claim [illegible], wherein the adjacent target sites are
separated by at least two nucleotides.

9. The method of claim 1, wherein the first and the second domains
are zinc finger polypeptides.

10. The method of claim 9, wherein the zinc finger polypeptide is
selected from the group consisting of Zif268 and NRE.

11. The method of claim 9, wherein the zinc finger polypeptides are
heterologous.

12. The method of claim 9, wherein the chimeric zinc finger protein
has femtomolar affinity for the adjacent target sites.

13. The method of claim 12, wherein the chimeric zinc finger protein
has about 2-4 femtomolar affinity for the adjacent target sites.

14. The method of claim 1, wherein the first domain is a zinc finger
polypeptide and the second domain comprises a heterologous DNA-binding domain
polypeptide.

15. The method of claim 1, wherein the chimeric zinc finger protein
further comprises a regulatory domain polypeptide.

16. A method of making a chimeric zinc finger protein that binds to 
adjacent target sites, the method comprising the steps of: 

(i) selecting a first and a second DNA-binding domain polypeptide 
of the chimeric zinc finger protein, wherein at least one of the domains comprises a zinc 
finger polypeptide, and wherein the first domain binds to a first target site and the second 
domain binds to a second target site, which target sites are adjacent; 

(ii) selecting a flexible linker that is five or more amino acids in
length; and

(iii) fusing the first and second domains with the flexible linker,
thereby making a chimeric zinc finger protein that binds to adjacent target sites.

17. The method of claim 16, wherein the adjacent target sites are
separated by zero nucleotides and the flexible linker is five or six amino acids in length.

18. The method of claim 16, wherein the adjacent target sites are
separated by one nucleotide and the flexible linker is seven, eight, or nine amino acids in
length.

19. The method of claim 18, wherein the flexible linker has the amino
acid sequence RQKDGERP.

20. The method of claim 16, wherein the adjacent target sites are
separated by two nucleotides and the flexible linker is ten, eleven, or twelve amino acids
in length.

21. The method of claim 20, wherein the flexible linker has the amino
acid sequence RQKDGGGSERP.

22. The method of claim 16, wherein the adjacent target sites are
separated by three nucleotides and the flexible linker is twelve or more amino acids in
length.

23. The method of claim 16, wherein the first and the second domains
are zinc finger polypeptides.

24. The method of claim 23, wherein the zinc finger polypeptide is
selected from the group consisting of Zif268 and NRE.

25. The method of claim 23, wherein the zinc finger polypeptides are
heterologous.

26. The method of claim 23, wherein the chimeric zinc finger protein
has femtomolar affinity for the adjacent target sites.

27. The method of claim 25, wherein the chimeric zinc finger protein
has about 2-4 femtomolar affinity for the adjacent target sites.

29. The method of claim 16, wherein the chimeric zinc finger protein
further comprises a regulatory domain polypeptide.

30. A chimeric zinc finger protein that binds to adjacent target sites,
the chimeric zinc finger protein comprising:

(i) a first and a second DNA-binding domain polypeptide of the
chimeric zinc finger protein, wherein at least one of the domains comprises a zinc finger
polypeptide, and wherein the first domain binds to a first target site and the second
domain binds to a second target site, which target sites are adjacent; and

(ii) a flexible linker that is at least 1-2 Å longer than the physical
separation between the first and second domains when they are individually bound to the
first and second target sites, as determined by structure-based modeling;

wherein the first and second domains are fused with the flexible linker.



31. The chimeric zinc finger protein of claim 30, wherein the flexible
linker is at least five amino acids in length.

32. The chimeric zinc finger protein of claim 30, wherein the flexible
linker is at least 8 amino acids in length.

33. The chimeric zinc finger protein of claim 32, wherein the linker has
the amino acid sequence RQKDGERP.

34. The chimeric zinc finger protein of claim 30, wherein the flexible
linker is at least 11 amino acids in length.

35. The chimeric zinc finger protein of claim 34, wherein the linker has
the amino acid sequence RQKDGGGSERP.

36. The chimeric zinc finger protein of claim 30, wherein the first and
the second domains are zinc finger polypeptides.

37. The chimeric zinc finger protein of claim 36, wherein the zinc
finger polypeptide is selected from the group consisting of Zif268 and NRE.

38. The chimeric zinc finger protein of claim 36, wherein the zinc
finger polypeptides are heterologous.

39. The chimeric zinc finger protein of claim 36, wherein the chimeric
zinc finger protein has femtomolar affinity for the adjacent target sites.

40. The chimeric zinc finger protein of claim 39, wherein the chimeric
zinc finger protein has about 2-4 femtomolar affinity for the adjacent target sites.

41. The chimeric zinc finger protein of claim 30, wherein the first
domain is a zinc finger polypeptide and the second domain comprises a heterologous
DNA-binding domain polypeptide.

42. The chimeric zinc finger protein of claim 30, wherein the chimeric
zinc finger protein further comprises a regulatory domain polypeptide.

43. An isolated nucleic acid encoding the chimeric zinc finger protein
of claim 30.

44. A chimeric zinc finger protein that binds to adjacent target sites,
the chimeric zinc finger protein comprising:

(i) a first and a second DNA-binding domain polypeptide of the
chimeric zinc finger protein, wherein at least one of the domains comprises a zinc finger
polypeptide, and wherein the first domain binds to a first target site and the second
domain binds to a second target site, which target sites are adjacent; and

(ii) a flexible linker that is five or more amino acids in length;
wherein the first and second domains are fused with the flexible linker.



45. The chimeric zinc finger protein of claim 44, wherein the adjacent
target sites are separated by zero nucleotides and the flexible linker is five or six amino
acids in length.

46. The chimeric zinc finger protein of claim 44, wherein the adjacent
target sites are separated by one nucleotide and the flexible linker is seven, eight, or nine
amino acids in length.

48. The chimeric zinc finger protein of claim 44, wherein the adjacent
target sites are separated by two nucleotides and the flexible linker is ten, eleven, or
twelve amino acids in length.

49. The chimeric zinc finger protein of claim 48, wherein the flexible 
linker has the amino acid sequence RQKDGGGSERP. 

50. The chimeric zinc finger protein of claim 44, wherein the adjacent
target sites are separated by three nucleotides and the flexible linker is twelve or more
amino acids in length.

51. The chimeric zinc finger protein of claim 44, wherein the first and
the second domains are zinc finger polypeptides.

52. The chimeric zinc finger protein of claim 51, wherein the zinc
finger polypeptide is selected from the group consisting of Zif268 and NRE.

53. The chimeric zinc finger protein of claim 51, wherein the zinc
finger polypeptides are heterologous.

54. The chimeric zinc finger protein of claim 51, wherein the chimeric
zinc finger protein has femtomolar affinity for the adjacent target sites.

55. The chimeric zinc finger protein of claim 54, wherein the chimeric
zinc finger protein has about 2-4 femtomolar affinity for the adjacent target sites.

56. The chimeric zinc finger protein of claim 44, wherein the first
domain is a zinc finger polypeptide and the second domain comprises a heterologous
DNA-binding domain polypeptide.

57. The chimeric zinc finger protein of claim 44, wherein the chimeric
zinc finger protein further comprises a regulatory domain polypeptide.

58. An isolated nucleic acid encoding the chimeric zinc finger protein
of claim 44.
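
The relationship between target-site spacing and linker length recited in claims 17-22 and 45-50 can be summarized as a simple lookup. The following is a minimal illustrative sketch only; the names `LINKER_LENGTHS_BY_GAP` and `linker_fits_gap` are hypothetical, and the sketch does not replace the structure-based determination of physical separation recited in claims 1 and 30.

```python
# Allowed linker lengths (in amino acids) for each spacing between the
# adjacent target sites (in nucleotides), as recited in the claims.
# An upper bound of None means "or more".
LINKER_LENGTHS_BY_GAP = {
    0: (5, 6),       # five or six amino acids
    1: (7, 9),       # seven, eight, or nine amino acids
    2: (10, 12),     # ten, eleven, or twelve amino acids
    3: (12, None),   # twelve or more amino acids
}


def linker_fits_gap(linker, gap_nucleotides):
    """Return True if the linker length matches the recited range."""
    low, high = LINKER_LENGTHS_BY_GAP[gap_nucleotides]
    length = len(linker)
    return length >= low and (high is None or length <= high)


# The two linker sequences named in the claims.
print(linker_fits_gap("RQKDGERP", 1))      # 8 residues, one-nucleotide gap
print(linker_fits_gap("RQKDGGGSERP", 2))   # 11 residues, two-nucleotide gap
```

Both named linkers fall within the ranges recited for the spacings with which they are paired.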
FIG. 1B.